135778 



PATENT APPLICATION 



CHANNEL SCHEDULING IN
OPTICAL ROUTERS

FDL Buffers," which is also incorporated by reference herein.

[0003] This application is further related to U.S. Ser. No.
(Attorney Docket 135779), filed concurrently herewith, entitled "Unified
Associative Memory of Data Channel Schedulers in an Optical Router" to Zheng
concurrently herewith, entitled "Optical Burst Scheduling Using Partitioned
Channel Groups" to Zheng et al.

STATEMENT OF FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT 
[0004] Not Applicable 
BACKGROUND OF THE INVENTION

1. TECHNICAL FIELD 
[0005] This invention relates in general to telecommunications and, more 
particularly, to a method and apparatus for optical switching. 

2. DESCRIPTION OF THE RELATED ART

[0006] Data traffic over networks, particularly the Internet, has increased
dramatically in recent years, and will continue to grow as the number of users
increases and new services requiring more bandwidth are introduced. The
increase in Internet traffic requires a network with high-capacity routers
capable of routing data packets of variable length. One option is the use of
optical networks.

[0007] The emergence of dense wavelength division multiplexing (DWDM)
technology has eased the bandwidth problem by increasing the capacity of
an optical fiber. However, the increased capacity creates a serious mismatch
with current electronic switching technologies that are capable of switching data
rates up to a few gigabits per second, as opposed to the multiple terabit per
second capability of DWDM. While emerging ATM switches and IP routers can
be used to switch data using the individual channels within a fiber, typically at a
few hundred gigabits per second, this approach implies that tens or hundreds of
switch interfaces must be used to terminate a single DWDM fiber with a large
number of channels. This could lead to a significant loss of statistical
multiplexing efficiency when the parallel channels are used simply as a collection
of independent links, rather than as a shared resource.

[0008] Different approaches advocating the use of optical technology in place
of electronics in switching systems have been proposed; however, the limitations
of optical component technology have largely limited optical switching to facility
management/control applications. One approach, called optical burst-switched
networking, attempts to make the best use of optical and electronic switching
technologies. The electronics provides dynamic control of system resources by
assigning individual user data bursts to channels of a DWDM fiber, while optical
technology is used to switch the user data channels entirely in the optical
domain.

[0009] Previous optical networks designed to directly handle end-to-end user 
data channels have been disappointing. 

[0010] Therefore, a need has arisen for a method and apparatus for providing 
an optical burst-switched network. 






BRIEF SUMMARY OF THE INVENTION 

[0011] The present invention provides circuitry for scheduling data bursts in
an optical burst-switched router. An optical switch routes optical information
from an incoming optical transmission medium to one of a plurality of outgoing
optical transmission media. A delay buffer coupled to the optical switch
provides n different delays for delaying information between the incoming
transmission medium and the outgoing transmission media. A scheduling
circuit is associated with each outgoing medium; the scheduling circuits each
comprise n+1 associative processors. Each associative processor includes
circuitry for (1) storing scheduling information for the associated outgoing
optical transmission medium relative to a respective one of the n delays and for
zero delay and (2) identifying available time periods relative to the respective
delays in which a data burst may be scheduled.

[0012] The present invention provides a fast and efficient method for
scheduling bursts in an optical burst-switched router.




BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS 
[0013] For a more complete understanding of the present invention, and the 
advantages thereof, reference is now made to the following descriptions taken in 
conjunction with the accompanying drawings, in which: 

[0014] Figure 1a is a block diagram of an optical network;

[0015] Figure 1b is a block diagram of a core optical router;

[0016] Figure 2 illustrates a data flow of the scheduling process; 

[0017] Figure 3 illustrates a block diagram of a scheduler; 

[0018] Figures 4a and 4b illustrate timing diagrams of the arrival of a burst 
header packet relative to a data burst; 

[0019] Figure 5 illustrates a block diagram of a DCS module; 

[0020] Figure 6 illustrates a block diagram of the associative memory of Pm;

[0021] Figure 7 illustrates a block diagram of the associative memory of Pg; 

[0022] Figure 8 illustrates a flow chart of a LAUC-VF scheduling method; 

[0023] Figure 9 illustrates a block diagram of a CCS module; 

[0024] Figure 10 illustrates a block diagram of the associative memory of Pt; 

[0025] Figure 11 illustrates a flow chart of a constrained earliest time method 
of scheduling the control channel; 

[0026] Figure 12 illustrates a block diagram of the path & channel selector; 

[0027] Figure 13 illustrates an example of a blocked output channel through
the recirculation buffer;

[0028] Figure 14 illustrates a memory configuration for a memory of the BHP
transmission module;

[0029] Figure 15 illustrates a block diagram of an optical router architecture
using passive FDL loops;

[0030] Figure 16 illustrates an example of a path & channel scheduler with
multiple Pm and Pg pairs;

[0031] Figures 17a and 17b illustrate timing diagrams of outbound data
channels;

[0032] Figure 18 illustrates clock signals for CLK_f and CLK_s;

[0033] Figures 19a and 19b illustrate alternative hardware modifications for
operation of a router;

[0034] Figure 20 illustrates a block diagram of an associative processor Pm;

[0035] Figure 21 illustrates a block diagram of an associative processor Pg;

[0036] Figure 22 illustrates a block diagram of an associative processor Pmg;

[0037] Figure 23 illustrates a block diagram of an associative processor Pmg;

[0038] Figure 24 illustrates a block diagram of an embodiment using multiple
associative processors for fast scheduling;

[0039] Figure 25 illustrates a block diagram of a processor PM-ext for use with
multiple channel groups; and

[0040] Figure 26 illustrates a block diagram of a processor PG-ext for use with
multiple channel groups.

DETAILED DESCRIPTION OF THE INVENTION

[0041] The present invention is best understood in relation to Figures 1-26 of
the drawings, like numerals being used for like elements of the various
drawings.

[0042] Figure 1a illustrates a general block diagram of an optical burst
switched network 4. The optical burst switched (OBS) network 4 includes
multiple electronic ingress edge routers 6 and multiple egress edge routers 8.
The ingress edge routers 6 and egress edge routers 8 are coupled to multiple core
optical routers 10. The connections between ingress edge routers 6, egress edge
routers 8 and core routers 10 are made using optical links 12. Each optical fiber
can carry multiple channels of optical data.

[0043] In operation, a data burst (or simply "burst") of optical data is the
basic data block to be transferred through the network 4. Ingress edge routers 6
and egress edge routers 8 are responsible for burst assembly and disassembly
functions, and serve as legacy interfaces between the optical burst switched
network 4 and conventional electronic routers.

[0044] Within the optical burst switched network 4, the basic data block to be
transferred is a burst, which is a collection of packets having some common
attributes. A burst consists of a burst payload (called "data burst") and a burst
header (called "burst header packet" or BHP). An intrinsic feature of the optical
burst switched network is that a data burst and its BHP are transmitted on
different channels and switched in optical and electronic domains, respectively,
at each network node. The BHP is sent ahead of its associated data burst with an
offset time τ (> 0). Its initial value, τ_0, is set by the (electronic) ingress edge
router 6.

[0045] In this invention, a "channel" is defined as a certain unidirectional
transmission capacity (in bits per second) between two adjacent routers. A
channel may consist of one wavelength or a portion of a wavelength (e.g., when
time-division multiplexing is used). Channels carrying data bursts are called
"data channels", and channels carrying BHPs and other control packets are
called "control channels". A "channel group" is a set of channels with a common
type and node adjacency. A link is defined as a total transmission capacity
between two routers, which usually consists of a "data channel group" (DCG)
and a "control channel group" (CCG) in each direction.

[0046] Figure 1b illustrates a block diagram of a core optical router 10. The
incoming DCG 14 is separated from the CCG 16 for each fiber 12 by
demultiplexer 18. Each DCG 14 is delayed by a fiber delay line (FDL) 19. The
delayed DCG is separated into channels 20 by demultiplexer 22. Each channel 20
is input to a respective input node on a non-blocking spatial switch 24.
Additional input and output nodes of spatial switch 24 are coupled to a
recirculation buffer (RB) 26. Recirculation buffer 26 is controlled by a
recirculation switch controller 28. Spatial switch 24 is controlled by a spatial
switch controller 30.

[0047] CCGs 16 are coupled to a switch control unit (SCU) 32. The SCU includes
an optical/electronic transceiver 34 for each CCG 16. The optical/electronic
transceiver 34 receives the optical CCG control information and converts the
optical information into electronic signals. The electronic CCG information is
received by a packet processor 36, which passes information to a forwarder 38.
The forwarder for each CCG is coupled to a switch 40. The output nodes of
switch 40 are coupled to respective schedulers 42. Schedulers 42 are coupled to a
Path & Channel Selector 44 and to respective BHP transmit modules 46. The
BHP transmit modules 46 are coupled to electronic/optical transceivers 48. The
electronic/optical transceivers produce the output CCG 52 to be combined with
the respective output DCG 54 information by multiplexer 50. Path & channel
selector 44 is also coupled to RB switch controller 28 and spatial switch controller
30.

[0048] The embodiment shown in Figure 1b has N input DCG-CCG pairs and
N output DCG-CCG pairs 52, where each DCG has K channels and each CCG has
only one channel (k=1). A DCG-CCG pair 52 is carried in one fiber. In general,
the optical router could be asymmetric, the number of channels k of a CCG 16
could be larger than one, and a DCG-CCG pair 52 could be carried in more than
one fiber 12. In the illustrated embodiment, there is one buffer channel group
(BCG) 56 with R buffer channels. In general, there could be more than one BCG
56. The optical switching matrix (OSM) consists of a (NK+R)x(NK+R) spatial
switch and a RxR switch with WDM (wavelength division multiplexing) FDL
buffer serving as recirculation buffer (RB) 26 to resolve data burst contentions on
outgoing data channels. The spatial switch is a strictly non-blocking switch,
meaning that an arriving data burst on an incoming data channel can be
switched to any idle outgoing data channel. The delay Δ introduced by the input
FDL 19 should be sufficiently long such that the SCU 32 has enough time to
process a BHP before its associated data burst enters the spatial switch.



[0049] The RxR RB switch is a broadcast-and-select type switch of the type
described in P. Gambini, et al., "Transparent Optical Packet Switching Network
Architecture and Demonstrators in the KEOPS Project", IEEE J. Selected Areas in
Communications, vol. 16, no. 7, pp. 1245-1259, Sept. 1998. It is assumed that the
RxR RB switch has B FDLs with the ith FDL introducing delay time Q_i, 1 ≤ i ≤ B.
It is further assumed without loss of generality that Q_1 < Q_2 < ... < Q_B and
Q_0 = 0, meaning no FDL buffer is used. Note that the FDL buffer is shared by all
N input DCGs and each FDL contains R channels. A data burst entering the RB
switch on any incoming channel can be delayed by one of B delay times provided.
The recirculation buffer in Figure 1b can be degenerated to passive FDL loops by
removing the function of RB switch, as shown in Figure 15, wherein different
buffer channels may have different delays.

[0050] The SCU is partially based on an electronic router. In Figure 1b, the
SCU has N input control channels and N output control channels. The SCU
mainly consists of packet processors (PPs) 36, forwarders 38, a switching fabric
40, schedulers 42, BHP transmission modules 46, a path & channel selector 44, a
spatial switch controller 30, and a RB switch controller 28. The packet processor
36, the forwarders 38, and the switching fabric 40 can be found in electronic
routers. The other components, especially the scheduler, are new to optical
routers. The design of the SCU uses distributed control as much as possible,
except for control of access to the shared FDL buffer, which is centralized.

[0051] The packet processor performs layer 1 and layer 2 decapsulation
functions and attaches a time-stamp to each arriving BHP, which records the
arrival time of the associated data burst to the OSM. The time-stamp is the sum
of the BHP arrival time, the burst offset-time τ carried by the BHP and the delay
Δ introduced by input FDL 19. The forwarder mainly performs the forwarding
table lookup to decide which outgoing CCG 52 to forward the BHP to. The
associated data burst will be switched to the corresponding DCG 54. The
forwarding can be done in a connectionless or connection-oriented manner.

[0052] There is one scheduler for each DCG-CCG pair 52. The scheduler 42
schedules the switching of the data burst on a data channel of the outgoing DCG
54 based on the information carried by the BHP. If a free data channel is found,
the scheduler 42 will then schedule the transmission of the BHP on the outgoing
control channel, trying to "resynchronize" the BHP and its associated data burst
by keeping the offset time τ (> 0) as close as possible to τ_0. After both the data
burst and BHP are successfully scheduled, the scheduler 42 will send the
configuration information to the spatial switch controller 30 if it is not necessary
to provide a delay through the recirculation buffer 26; otherwise it will also send
the configuration information to the RB switch controller 28.

[0053] The data flow of the scheduling decision process is shown in Figure 2. In
decision block 60, the scheduler 42 determines whether or not there is enough
time to schedule an incoming data burst. If so, the scheduler determines in
decision block 62 whether the data burst can be scheduled, i.e., whether there is
an unoccupied space in the specified output DCG 54 for the data burst. In order
to schedule the data burst, there must be an available space to accommodate the
data burst in the specified output DCG. This space may start within a time
window beginning at the point of arrival of the data burst at the spatial switch 24
and extending to the maximum delay which can be provided by the recirculation
buffer 26. If the data burst can be scheduled, then the scheduler 42 must
determine whether there is a space available in the output CCG 52 for the BHP
in decision block 64.

[0054] If any of the decisions in decision blocks 60, 62 or 64 are negative, the
data burst and BHP are dropped in block 65. If all of the decisions in decision
blocks 60, 62 and 64 are positive, the scheduler sends the scheduling information
to the path and channel selector 44. The configuration information from
scheduler to path & channel selector includes incoming DCG identifier, incoming
data channel identifier, outgoing DCG identifier, outgoing data channel
identifier, data burst arrival time to the spatial switch, data burst duration, and
FDL identifier i (delay time Q_i is requested, 0 ≤ i ≤ B).

[0055] If the FDL identifier is 0, meaning no FDL buffer is required, the path
& channel selector 44 will simply forward the configuration information to the
spatial switch controller 30. Otherwise, the path & channel selector 44 searches
for an idle incoming buffer channel to the RB switch 26 in decision block 68. If
found, the path and channel selector 44 searches for an idle outgoing buffer
channel from the RB switch 26 to carry the data burst reentering the spatial
switch after the specified delay inside the RB switch 26 in decision block 70. It is
assumed that once the data burst enters the RB switch, it can be delayed for any
discrete time from the set {Q_1, Q_2, ..., Q_B}. If this is not the case, the path &
channel selector 44 will have to take the RB switch architecture into account. If
both idle channels to and from the RB switch 26 are found, the path & channel
selector 44 will send configuration information to the spatial switch controller 30
and the RB switch controller 28 and send an ACK (acknowledgement) back to
the scheduler 42. Otherwise, it will send a NACK (negative acknowledgement)
back to the scheduler 42 and the BHP and data burst will be discarded in block
65.
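
The decision flow of Figure 2, as described in paragraphs [0053] to [0055], can be sketched in software as follows. This is an illustrative Python sketch only, not the disclosed hardware: the function name, the `Decision` record and the boolean inputs are hypothetical stand-ins for the outcomes of the searches performed by the scheduler 42 and the path & channel selector 44.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    accepted: bool
    reason: str
    fdl_id: Optional[int] = None


def schedule_burst(enough_time: bool,
                   data_slot_fdl: Optional[int],
                   control_slot_free: bool,
                   idle_in_buffer: bool = True,
                   idle_out_buffer: bool = True) -> Decision:
    """Walk the decision blocks of Figure 2 for one BHP (hypothetical model).

    data_slot_fdl is the FDL identifier i (0 = no buffering) found by the
    data channel search, or None when no space exists in the output DCG.
    """
    if not enough_time:                      # decision block 60
        return Decision(False, "dropped: not enough time to schedule")
    if data_slot_fdl is None:                # decision block 62
        return Decision(False, "dropped: no space in output DCG")
    if not control_slot_free:                # decision block 64
        return Decision(False, "dropped: no space in output CCG")
    if data_slot_fdl == 0:                   # no FDL buffer required
        return Decision(True, "forwarded to spatial switch controller", 0)
    if not idle_in_buffer:                   # decision block 68
        return Decision(False, "NACK: no idle incoming buffer channel")
    if not idle_out_buffer:                  # decision block 70
        return Decision(False, "NACK: no idle outgoing buffer channel")
    return Decision(True, "ACK: spatial and RB switch controllers configured",
                    data_slot_fdl)
```

For instance, `schedule_burst(True, 2, True, idle_out_buffer=False)` models a burst that needs FDL 2 but finds no idle outgoing buffer channel, so the selector answers with a NACK and the burst is discarded.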

[0056] Configuration information from the path & channel selector 44 to the
spatial switch controller 30 includes incoming DCG identifier, incoming data
channel identifier, outgoing DCG identifier, outgoing data channel identifier,
data burst arrival time to the spatial switch, data burst duration, and FDL
identifier i (delay time Q_i is requested, 0 ≤ i ≤ B). If i > 0, the information also
includes the incoming BCG identifier (to the RB switch), incoming buffer channel
identifier (to the RB switch), outgoing BCG identifier (from the RB switch), and
outgoing buffer channel identifier (from the RB switch).

[0057] Configuration information from path & channel selector to RB switch
controller includes an incoming BCG identifier (to the RB switch), incoming
buffer channel identifier (to the RB switch), outgoing BCG identifier (from the RB
switch), outgoing buffer channel identifier (from the RB switch), data burst
arrival time to the RB switch, data burst duration, and FDL identifier i (delay
time Q_i is requested, 1 ≤ i ≤ B).

[0058] The spatial switch controller 30 and the RB switch controller 28 will
perform the mapping from the configuration information received to the physical
components involved in setting up the internal path(s), and configure the
switches just in time to let the data burst fly through the optical router 10. When
the FDL identifier is larger than 0, the spatial switch controller will set up two
internal paths in the spatial switch, one from the incoming data channel to the
incoming recirculation buffer channel when the data burst arrives at the spatial
switch, another from the outgoing buffer channel to the outgoing data channel
when the data burst reenters the spatial switch. Upon receiving the ACK from
the path & channel selector 44, the scheduler 42 will update the state information
of selected data and control channels, and is ready to process a new BHP.

[0059] Finally, the BHP transmission module arranges the transmission of 
BHPs at times specified by the scheduler. 

[0060] The above is a general description of how a data burst is
scheduled in the optical router. The design principles described below could
readily be extended to recirculate data bursts through the RxR recirculation
buffer switch more than once, if so desired.

[0061] Figure 3 illustrates a block diagram of a scheduler 42. The scheduler 42
includes a scheduling queue 80, a BHP processor 82, a data channel scheduling
(DCS) module 84, and a control channel scheduling (CCS) module 86. Each
scheduler needs only to keep track of the busy/idle periods of its associated
outgoing DCG 54 and outgoing CCG 52.

[0062] BHPs arriving from the electronic switch are first stored in the
scheduling queue 80. For basic operations, all that is required is one scheduling
queue 80; however, virtual scheduling queues 80 may be maintained for different
service classes. Each queue 80 could be served according to the arrival order of
BHPs or according to the actual arrival order of their associated data bursts. The
BHP processor 82 coordinates the data and control channel scheduling process
and sends the configuration to the path & channel selector 44. It could trigger the
DCS module 84 and the CCS module 86 in sequence or in parallel, depending on
how the DCS and CCS modules 84 and 86 are implemented.

[0063] In the case of serial scheduling, the BHP processor 82 first triggers the
DCS module 84 for scheduling the data burst (DB) on a data channel in a desired
output DCG 54. After determining when the data burst will be sent out, the BHP
processor then triggers the CCS module 86 for scheduling the BHP on an
associated control channel.

[0064] In the case of parallel scheduling, the BHP processor 82 triggers the
DCS module 84 and CCS module 86 simultaneously. Since the CCS module 86
does not know when the data burst will be sent out, it schedules the BHP for all
possible departure times of the data burst or a subset of them. There are in total
B+1 possible departure times. Based on the actual data burst departure time
reported from the DCS module 84, the BHP processor 82 will pick the right time
to send out the BHP.

[0065] Slotted transmission is used in data and control channels between edge
and core and between core nodes in the OBS network. A slot is a fixed-length
time period. Let T_s be the duration (e.g., in μs) of a time slot in data channels
and T_f be the duration of a time slot in control channels. T_s · r_d Kbits of
information can be sent during a slot if the data channel speed is r_d gigabits
per second. Similarly, T_f · r_c Kbits of information can be sent during a slot if
the control channel speed is r_c gigabits per second. Two scenarios are
considered, (1) r_c = r_d and (2) r_c ≠ r_d. In the latter case, a typical example is
that r_c = r_d / 4 (e.g., OC-48 is used in control channels and OC-192 is used in
data channels).

[0066] Without loss of generality, it is assumed that T_f is an integer multiple of
T_s. Two examples are depicted in Figures 4a and 4b (see also Figure 18), which
illustrate the timestamp and burst offset-time in a slotted transmission system
for the cases where T_f = T_s and T_f = 4T_s, with the initial offset time
τ_0 = 8T_s. To simplify the description, we use "time frame" to designate a time
slot in control channels. It is further assumed without loss of generality that (1)
data bursts are variable length, in multiples of slots, and can only arrive at slot
boundaries, and (2) BHPs are also variable length, in, for instance, multiples of
bytes. Fixed-length data bursts and BHPs are just special cases. In slotted
transmission, there is some overhead in each slot for various purposes like
synchronization and error detection. Suppose the frame payload on control
channels is P_f bytes, which is less than (T_f · r_c) · 1000/8 bytes, the total
amount of information that can be transmitted in a time frame.
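
The slot and frame capacities stated in paragraphs [0065] and [0066] follow from simple unit arithmetic (microseconds times gigabits per second gives kilobits). The short Python sketch below works one example; the numeric values (a 1 μs data slot, 10 Gb/s data channels, 2.5 Gb/s control channels) are assumptions chosen only to mirror the OC-192/OC-48 ratio of 4 and do not come from the specification.

```python
def slot_capacity_kbits(slot_us: float, rate_gbps: float) -> float:
    """Kilobits carried in one slot: T [us] * r [Gb/s] = Kbits."""
    return slot_us * rate_gbps


def frame_capacity_bytes(frame_us: float, rate_gbps: float) -> float:
    """Upper bound on bytes per control time frame: (T_f * r_c) * 1000 / 8."""
    return frame_us * rate_gbps * 1000 / 8


# Assumed example values (not taken from the specification).
T_s = 1.0          # data slot duration in microseconds
T_f = 4 * T_s      # control time frame, the T_f = 4 T_s case
r_d = 10.0         # data channel rate in Gb/s (approximately OC-192)
r_c = r_d / 4      # control channel rate in Gb/s (approximately OC-48)

data_kbits = slot_capacity_kbits(T_s, r_d)      # 10.0 Kbits per data slot
frame_bytes = frame_capacity_bytes(T_f, r_c)    # 1250.0 bytes per time frame
```

Under these assumed values the frame payload P_f would have to be less than 1250 bytes, the remainder being per-slot overhead.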

[0067] The OSM is configured periodically. For slotted transmission on data
channels, a typical example of the configuration period is one slot, although the
configuration period could also be a multiple of slots. Here it is assumed that the
OSM is configured every slot. The length of an FDL Q_i also needs to be a multiple
of slots, 1 ≤ i ≤ B. Due to the slotted transmission and switching, it is suggested to
use the time slot as a basic time unit in the SCU for the purpose of data channel
scheduling, control channel scheduling and buffer channel scheduling, as well as
synchronization between BHPs and their associated data bursts. This will
simplify the design of various schedulers.

[0068] The following integer variables are used in connection with Figures 4a,
4b and 5:

t_BHP : the beginning of a time frame during which the BHP enters the
SCU;

t_DB : the arrival time of a data burst (DB) to the optical switching
matrix (OSM);

l_DB : the duration/length of a DB in slots;

Δ : delay (in slots) introduced by input FDL;

τ : burst offset-time (in slots).

[0069] Each arriving BHP to the SCU is time-stamped at the transceiver
interface, right after O/E conversion, recording the beginning of the time frame
during which the BHP enters the SCU. BHPs received by the SCU in the
same time frame will have the same timestamp t_BHP. For scheduling
purposes, the most important variable is t_DB, the DB arrival time to the OSM.
Suppose a b-bit slot counter is used in the SCU to keep track of time; t_DB can be
calculated as follows, with T_f expressed in slots.

[0070] t_DB = (t_BHP · T_f + Δ + τ) mod 2^b.   (1)

[0071] Timestamp t_DB will be carried by the BHP within the SCU 32. Note
that the burst offset-time τ is also counted starting from the beginning of the time
frame in which the BHP arrives, as shown in Figures 4a-b, where in Figure 4a,
t_BHP = 9 and τ = 6 slots, and in Figure 4b, t_BHP = 2 and τ = 7 slots. Suppose
Δ = 100 slots; we have t_DB = 115, meaning that the DB will arrive at slot
boundary 115. In Figures 4a-b, 1 ≤ τ ≤ τ_0 = 8. It is assumed without loss of
generality that the switching latency of the spatial switch in Figure 1b is
negligible. So the data burst arrival time t_DB to the spatial switch 24 is also its
departure time if no FDL buffer is used. Note that even if the switching latency is
not negligible, t_DB can still be used as the data burst departure time in channel
scheduling, as the switching latency is compensated at router output ports where
data and control channels are resynchronized.
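
Equation (1) and the two numerical cases of Figures 4a and 4b can be checked with a few lines of Python. This is a sketch for illustration: the function and parameter names are hypothetical, T_f is expressed in slots (1 for the T_f = T_s case, 4 for the T_f = 4T_s case), and the counter width b = 8 is an assumed value.

```python
def db_arrival_time(t_bhp: int, frame_slots: int, delta: int, tau: int,
                    b: int) -> int:
    """Equation (1): t_DB = (t_BHP * T_f + delta + tau) mod 2^b, in slots."""
    return (t_bhp * frame_slots + delta + tau) % (2 ** b)


B_BITS = 8  # assumed width of the slot counter

# Figure 4a: T_f = T_s, t_BHP = 9, tau = 6, delta = 100
t_db_4a = db_arrival_time(9, 1, 100, 6, B_BITS)

# Figure 4b: T_f = 4 T_s, t_BHP = 2, tau = 7, delta = 100
t_db_4b = db_arrival_time(2, 4, 100, 7, B_BITS)

# Both cases give slot boundary 115, as stated in paragraph [0071].
assert t_db_4a == 115 and t_db_4b == 115
```

The modulo term only matters once the counter wraps: with the same assumed b = 8, a BHP time-stamped at frame 60 with T_f = 4 yields (240 + 100 + 7) mod 256 = 91.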

[0072] Figure 5 illustrates a block diagram of a DCS module 84. In this
embodiment, associative processor arrays Pm 90 and Pg 92 perform parallel
searches of unscheduled channel times and gaps between scheduled channel
times and update state information. Gaps and unscheduled times are
represented in relative times. Pm 90 and Pg 92 are coupled to control processor
CP1 94. In one embodiment, a LAUC-VF (Latest Available Unused Channel with
Void Filling) scheduling principle is used to determine a desired scheduling, as
described in connection with U.S. Ser. No. 09/689,584, entitled "Hardware
Implementation of Channel Scheduling Algorithms of Optical Routers with FDL
Buffers" to Zheng et al., filed October 12, 2000, which is incorporated by
reference herein.

[0073] The DCS module 84 uses two b-bit slot counters, C and C1. Counter C
keeps track of the time slots and can be shared with the CCS module 86.
Counter C1 records the elapsed time slots since the last BHP was received. Both
counters are incremented by every pulse of the slot clock. However, counter C1 is
reset to 0 when the DCS module 84 receives a new BHP. Once counter C1 reaches
2^b − 1, it stops counting, indicating that at least 2^b − 1 slots have elapsed since
the last BHP. The value of b should satisfy 2^b > W_s, where W_s is the data
channel scheduling window. W_s = τ_0 + Δ + Q_B + L_max − δ, where L_max is the
maximum length of a DB and δ is the minimum delay of a BHP from O/E
conversion to the scheduler 42. Assuming that τ_0 = 8, Δ = 120, Q_B = 32,
L_max = 64, and δ = 40, then W_s = 184 slots. In this case, b = 8 bits.
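
The window arithmetic of paragraph [0073] can be reproduced as follows. This is a minimal Python sketch with hypothetical function names; it reads the sizing condition as the strict inequality 2^b > W_s, which is an assumption about the intended relation (a non-strict reading gives the same b for this example).

```python
def scheduling_window(tau0: int, delta: int, q_max: int, l_max: int,
                      bhp_delay: int) -> int:
    """W_s = tau_0 + delta + Q_B + L_max - (minimum BHP delay), in slots."""
    return tau0 + delta + q_max + l_max - bhp_delay


def counter_bits(window: int) -> int:
    """Smallest b with 2**b > window (strict inequality assumed)."""
    return window.bit_length()


w_s = scheduling_window(tau0=8, delta=120, q_max=32, l_max=64, bhp_delay=40)
b = counter_bits(w_s)   # 184 slots need an 8-bit counter, since 2**8 = 256
```

With these inputs the sketch gives W_s = 184 and b = 8, matching the figures quoted in the paragraph.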

[0074] Associative processor Pm in Figure 5 is used to store the unscheduled
time of each data channel in a DCG. Let t_i be the unscheduled time of channel
H_i, which is stored in the ith entry of Pm, 0 ≤ i ≤ K−1. Then from slot t_i onwards,
channel H_i is free, i.e., nothing is scheduled. t_i is a relative time, with respect
to the time slot in which the latest BHP was received by the scheduler. Pm has an
associative memory of 2K words to store the unscheduled times t_i and channel
identifiers, respectively. The unscheduled times are stored in descending order.
For example, in Figure 6 we have K = 8 and t_0 ≥ t_1 ≥ t_2 ≥ t_3 ≥ t_4 ≥ t_5 ≥ t_6 ≥ t_7.

[0075] Similarly, associative processor Pg in Figure 5 is used to store the gaps
of data channels in a DCG. We use l_j and r_j to denote the start time and ending
time of gap j, 0 ≤ j ≤ G−1, which are also relative times. This gap is stored in the
jth entry of Pg together with the identifier of its corresponding data channel. Pg
has an associative memory of G words to store the gap start time, gap ending
time, and channel identifiers, respectively. Gaps are also stored in descending
order according to their start times l_j. For example, Figure 7 illustrates the
associative memory of Pg, where l_0 ≥ l_1 ≥ l_2 ≥ ... ≥ l_{G−2} ≥ l_{G−1}. G is the
total number of gaps that can be stored. If there are more than G gaps, the newest
gap with a larger start time will push out the gap with the smallest start time,
which resides at the bottom of the associative memory. Note that if l_j = 0, then
there are in total j gaps in the DCG, as l_{j+1} = l_{j+2} = ... = l_{G−1} = 0.

[0076] Upon receiving a request from the BHP processor to schedule a DB
with departure time t_DB and duration l_DB, the control processor (CP1) 94 first
records the time slot t_sch during which it receives the request, reads counter C1
(t_e ← C1) and resets C1 to 0. Using t_sch as a new reference time, CP1 then
calculates the DB departure time (no FDL buffer) with respect to t_sch as

t'_DB = (t_DB − t_sch + 2^b) mod 2^b.   (2)

In the meantime, CP1 updates Pm using

t_i = max(0, t_i − t_e),   0 ≤ i ≤ K−1   (3)

and updates Pg using the following formulas,

l_j = max(0, l_j − t_e),   0 ≤ j ≤ G−1   (4)

and

r_j = max(0, r_j − t_e),   0 ≤ j ≤ G−1.   (5)
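
The re-referencing step of equations (2) through (5) can be emulated sequentially in a few lines of Python. This is only a software sketch with hypothetical names and made-up entry values; in the described embodiment the updates are applied to all entries of the associative memories of Pm and Pg.

```python
from typing import List, Tuple


def rebase_departure(t_db: int, t_sch: int, b: int) -> int:
    """Equation (2): departure time relative to t_sch, modulo 2^b."""
    return (t_db - t_sch + 2 ** b) % (2 ** b)


def age_unscheduled(times: List[int], t_e: int) -> List[int]:
    """Equation (3): shift every unscheduled time by the elapsed slots t_e."""
    return [max(0, t - t_e) for t in times]


def age_gaps(gaps: List[Tuple[int, int]], t_e: int) -> List[Tuple[int, int]]:
    """Equations (4) and (5): shift the start and end time of every gap."""
    return [(max(0, l - t_e), max(0, r - t_e)) for l, r in gaps]


# Made-up example: 8 slots elapsed since the previous BHP, 8-bit counters.
t_e = 8
pm_times = age_unscheduled([20, 12, 5, 0], t_e)    # [12, 4, 0, 0]
pg_gaps = age_gaps([(30, 40), (10, 18)], t_e)      # [(22, 32), (2, 10)]
t_db_rel = rebase_departure(115, 100, 8)           # 15 slots from now
wrapped = rebase_departure(3, 250, 8)              # 9, counter has wrapped
```

Adding 2^b before the modulo keeps the intermediate value non-negative when the slot counter has wrapped between t_sch and t_DB, as in the last line of the example.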

[0077] After the memory update, CP1 94 arranges the search of eligible
outgoing data channels to carry the data burst according to the LAUC-VF
method, cited above. The flowchart is given in Figure 8. In block 100, an index i
is set to "0". In block 102, Pg finds a gap in which to transmit the data burst at
time t'_DB + Q_i. In block 106, Pm finds an unscheduled channel on which to
transmit the data burst at t'_DB + Q_i. Note that the operations of finding a gap in
Pg to transmit the DB at time t'_DB + Q_i and finding an unscheduled time in Pm to
transmit the DB at time t'_DB + Q_i are preferably performed in parallel. The
operation of finding a gap in Pg to transmit the data burst at time t'_DB + Q_i
(block 102) includes parallel comparison of each entry in Pg with
(t'_DB + Q_i, t'_DB + Q_i + l_DB). If t'_DB + Q_i ≥ l_j and t'_DB + Q_i + l_DB ≤ r_j,
the response bit of entry j returns 1; otherwise it returns 0, 0 ≤ j ≤ G−1. If at
least one entry in Pg returns 1, the gap with the smallest index is selected.

[0078] The operation of finding an unscheduled time in Pm to transmit the DB at time $t'_{DB} + Q_i$ (block 106) includes parallel comparison of each entry in Pm with $t'_{DB} + Q_i$. If $t'_{DB} + Q_i \ge t_j$, the response bit of entry j returns 1, otherwise it returns 0, $0 \le j \le K-1$. If at least one entry in Pm returns 1, the entry with the smallest index is selected.

[0079] If the scheduling is successful in decision blocks 104 or 108, then CP1 will inform the BHP processor 82 of the selected outgoing data channel and the FDL identifier in blocks 105 or 109, respectively. After receiving an ACK from the BHP processor 82, CP1 94 will update Pg 92 or Pm 90 or both. If scheduling is not successful, i is incremented in block 110, and Pm and Pg try to find a time to schedule the data burst at a different delay. Once $Q_i$ reaches the maximum delay (decision block 112), the processors Pm and Pg report that the data burst cannot be scheduled in block 114.
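The search loop of Figure 8 can be sketched sequentially in software as follows. This is only an illustrative model: in the described hardware the comparisons are parallel, and the preference for a gap over an unscheduled channel when both are eligible is an assumption of this sketch. The names `find_gap`, `find_unscheduled` and `lauc_vf` are hypothetical.

```python
def find_gap(pg, start, end):
    """pg: list of (l, r, channel) sorted by descending l. Smallest index wins."""
    for idx, (l, r, ch) in enumerate(pg):
        if start >= l and end <= r:
            return idx, ch
    return None

def find_unscheduled(pm, start):
    """pm: list of (t, channel) sorted by descending t. Smallest index wins."""
    for idx, (t, ch) in enumerate(pm):
        if start >= t:
            return idx, ch
    return None

def lauc_vf(pm, pg, t_db_rel, l_db, delays):
    """Try each FDL delay Q_i in turn (delays[0] is 0, i.e. no FDL buffer)."""
    for i, q in enumerate(delays):
        start = t_db_rel + q
        hit = find_gap(pg, start, start + l_db)       # blocks 102/104
        if hit is not None:
            return i, "Pg", hit[0], hit[1]
        hit = find_unscheduled(pm, start)             # blocks 106/108
        if hit is not None:
            return i, "Pm", hit[0], hit[1]
    return None                                       # block 114: cannot schedule
```

The returned tuple carries the delay index, the memory that produced the match, the entry index, and the channel identifier, which is the information later needed for the memory update.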

[0080] To speed up the scheduling process, the search can be performed in parallel. For example, if B=2 and three identical Pm's and Pg's are used, as shown in Figure 5, one parallel search will determine whether the data burst can be sent out at times $t'_{DB}$, $t'_{DB} + Q_1$, and $t'_{DB} + Q_2$. If the data burst can be sent out at more than one of these times, the smallest time is chosen. In another example, if B=5 and three identical Pm's and Pg's are used, at most two parallel searches will determine whether the DB can be scheduled.
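The two examples above are consistent with a simple count: B+1 candidate departure times are examined, a fixed number per round. A small sketch, assuming that each Pm/Pg pair tests one candidate time per search (the function name is hypothetical):

```python
import math

def parallel_search_rounds(B, units):
    """Upper bound on search rounds for B+1 candidate times with `units` Pm/Pg pairs."""
    return math.ceil((B + 1) / units)
```

For B=2 with three pairs this gives one round, and for B=5 with three pairs it gives two, matching the examples in the text.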

[0081] Some simplified versions of the LAUC-VF method, which could also be used in the implementation, are listed below. First, an FF-VF (first fit with void filling) method could be used, wherein the unscheduled times in Pm and the gaps in Pg are not kept sorted (in either descending or ascending order), and the first eligible data channel found is used to carry the data burst. Second, a LAUC (latest available unscheduled channel) method could be used wherein Pg is not used, i.e., no void filling is considered. This will further simplify the design. Third, an FF (first fit) method could be used. FF is a simplified version of FF-VF where no void filling is used.

[0082] The block diagram of the CCS module 86 is shown in Figure 9. Similar to the DCS module 84, associative processor Pt 120 keeps track of the usage of the control channel. Since a maximum of $P_f$ bytes of payload can be transmitted per frame, memory T 121 of Pt 120 tracks only the number of bytes available per frame (Figure 10). Relative time is used here as well. The CCS module 86 has two $b_1$-bit frame counters, $C^f$ and $C_1^f$. $C^f$ counts the time frames. $C_1^f$ records the elapsed frames since the receiving of the last BHP. Upon receiving a BHP with arrival time $t_{DB}$, CP2 122 timestamps the frame during which this BHP is received, i.e., $t^f_{sch} \leftarrow C^f$. In the meantime, it reads counter $C_1^f$ ($t^f_e \leftarrow C_1^f$) and resets $C_1^f$ to 0. It then updates Pt by shifting the $B_i$'s down by $t^f_e$ positions, i.e., $B_{i - t^f_e} = B_i$ for $t^f_e \le i \le 2^{b_1} - 1$, and $B_i = P_f$ for $2^{b_1} - t^f_e \le i \le 2^{b_1} - 1$. In the initialization, all the entries in Pt are set to $P_f$. Next, CP2 calculates the frame $t^f_{DB}$ during which the data burst will depart (assuming FDL $Q_i$ is used) using

$t^f_{DB}(Q_i) = \lfloor ((t_{DB} + Q_i) \bmod 2^b) / T_f \rfloor$, $0 \le i \le B$, (6)

where $T_f$ is the frame length in slots. The relative time frame in which the DB will depart is calculated from

$t'^f_{DB}(Q_i) = (t^f_{DB}(Q_i) - t^f_{sch} + 2^{b_1}) \bmod 2^{b_1}$, $0 \le i \le B$. (7)
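The bookkeeping of the CCS module can be sketched in software as below. The list `B` stands in for memory T, and the function names are assumptions for illustration; the hardware performs the shift with a pointer or parallel data movement rather than list slicing.

```python
def shift_pt(B, t_e_f, p_f):
    """Shift per-frame byte budgets down by t_e_f frames; refill the tail with p_f."""
    k = min(t_e_f, len(B))
    return B[k:] + [p_f] * k

def departure_frames(t_db, delays, t_sch_f, b, b1, T_f):
    """Return (absolute frame, relative frame) per candidate delay Q_i."""
    frames = []
    for q in delays:
        t_f = ((t_db + q) % (1 << b)) // T_f               # equation (6)
        rel = (t_f - t_sch_f + (1 << b1)) % (1 << b1)      # equation (7)
        frames.append((t_f, rel))
    return frames
```

For example, with T_f = 4 slots, a burst departing at slot 18 falls in frame 4; if the BHP was stamped in frame 3, the relative departure frame is 1.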

[0083] The parameter $b_1$ can be estimated from parameter b, e.g., $2^{b_1} = 2^b / T_f$. When b=8 and $T_f$=4, $b_1$=6. The following method is used to search for the possible BHP departure time for a given DB departure time t (e.g., $t = t'^f_{DB}(Q_i)$). The basic idea is to send the BHP as early as possible, but the offset time should be no larger than $\tau_0$ (as described in connection with Figures 4a and 4b). Let $J = \lfloor \tau_0 / T_f \rfloor$. For example, when $\tau_0 = 8$ slots and $T_f = 1$ slot, $J = 8$. When $\tau_0 = 8$ slots and $T_f = 4$ slots, $J = 2$. Suppose the BHP length is X bytes.

[0084] In the preferred embodiment, a constrained earliest time (CET) method is used for scheduling the control channel, as shown in Figure 11. In step 130, Pt 120 performs a parallel comparison of X (i.e., the length of a BHP) with the contents $B_{t-j}$ of the relevant entries $E_{t-j}$ of memory T 121, $0 \le j \le J-1$ and $t-j \ge 0$. If $X \le B_{t-j}$, entry $E_{t-j}$ returns 1, otherwise it returns 0. In step 132, if at least one entry in Pt returns 1, the entry with the smallest index is chosen in step 134. The index is stored and the CCS module 86 reports that a frame to send the BHP has been found. If no entry in Pt returns a "1", then a negative acknowledgement is sent to the BHP processor 82 (step 136).

[0085] The actual frame $t_f$ in which the BHP will be sent out is $(t^f_{DB} - j + 2^{b_1}) \bmod 2^{b_1}$ if $E_{t-j}$ is chosen. The new burst offset time is $(t_{DB} \bmod T_f) + j \cdot T_f$.
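A sequential sketch of the CET selection is given below. It assumes that the relative departure frame t indexes directly into a list `B` of per-frame byte budgets; the function name `cet` and the return convention are hypothetical.

```python
def cet(B, t, J, X):
    """Constrained earliest time search.

    B: bytes still available in each relative frame (index 0 is the current frame).
    t: relative frame in which the data burst departs.
    J: number of frames the BHP may precede the burst (j = 0 .. J-1).
    X: BHP length in bytes.
    Returns (entry index, j) for the earliest eligible frame, or None.
    """
    candidates = [t - j for j in range(J) if t - j >= 0 and X <= B[t - j]]
    if not candidates:
        return None                      # step 136: negative acknowledgement
    idx = min(candidates)                # step 134: smallest index = earliest frame
    return idx, t - idx
```

Choosing the smallest eligible index sends the BHP as early as the constraint allows; the returned j is the value used in the offset-time formula above.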



[0086] After running the CET method, the CCS module 86 sends the BHP processor 82 the information on whether the BHP can be scheduled and in which time frame it will be sent. Once it gets an ACK from the BHP processor 82, the CCS module 86 will update Pt. For example, if the content in entry y needs to be updated, then $B_y \leftarrow B_y - X$. If the BHP cannot be scheduled, the CCS module 86 will send a NACK to the BHP processor 82. In a real implementation, the contents in Pt do not have to move physically. A pointer can be used to record the entry index associated with the reference time frame 0.

•y 

[0087] For parallel scheduling, as discussed below, since the CCS module 86 does not know the actual departure time of the data burst, it schedules the BHP for all possible departure times of the data burst, or a subset, and reports the results to the BHP processor 82. When B=2, there are three possible data burst departure times, $t'_{DB}$, $t'_{DB} + Q_1$ and $t'_{DB} + Q_2$. Like the DCS module 84, if three identical Pt's are used, as shown in Figure 9, one parallel search will determine whether the BHP can be scheduled for the three possible data burst departure times.



[0088] A block diagram of the path & channel selector 44 is shown in Figure 12. The function of the path & channel selector 44 is to control the access to the RxR RB switch 26 and to instruct the RB switch controller 28 and the spatial switch controller 30 to configure the respective switches 26 and 24. The path & channel selector 44 includes processor 140 coupled to a recirculation-buffer-in scheduling (RBIS) module 142, a recirculation-buffer-out scheduling (RBOS) module 144 and a queue 146. The RBIS module 142 keeps track of the usage of the R incoming channels to the RB switch 26 while the RBOS module 144 keeps track of the usage of the R outgoing channels from the RB switch 26. Any scheduling method can be used in RBIS and RBOS modules 142 and 144, e.g., LAUC-VF, FF-VF, LAUC, FF, etc. Note that RBIS module 142 and RBOS module 144 may use the same or different scheduling methods. From a manufacturing viewpoint, it is better that the RBIS and RBOS modules use the same scheduling method as the DCS module 84. Without loss of generality, it is assumed here that the LAUC-VF method is used in both RBIS and RBOS modules 142 and 144; thus, the design of the DCS module can be reused for these modules.

[0089] Assume that a data burst with duration $l_{DB}$ arrives at the OSM at time $t_{DB}$ and requires a delay time of $Q_i$. The processor 140 triggers the RBIS module 142 and RBOS module 144 simultaneously. It sends the information of $t_{DB}$ and $l_{DB}$ to the RBIS module 142, and the information of time-to-leave the OSM ($t_{DB} + Q_i$) and $l_{DB}$ to the RBOS module 144. The RBIS module 142 searches for incoming channels to the RB switch 26 which are idle for the time period of ($t_{DB}$, $t_{DB} + l_{DB}$). If there are two or more eligible incoming channels, the RBIS module will choose one according to LAUC-VF. Similarly, the RBOS module 144 searches for outgoing channels from the RB switch 26 which are idle for the time period of ($t_{DB} + Q_i$, $t_{DB} + l_{DB} + Q_i$). If there are two or more eligible outgoing channels, the RBOS module 144 will choose one according to LAUC-VF. The RBIS (RBOS) module sends either the selected incoming (outgoing) channel identifier or NACK to the processor. If an eligible incoming channel to the RB switch 26 and an eligible outgoing channel from the RB switch 26 are found, the processor will send back ACK to both RBIS and RBOS modules, which will then update the channel state information. In the meantime, it will send ACK to the scheduler 42 and the configuration information to the two switch controllers 28 and 30. Otherwise, the processor 140 will send NACK to the RBIS and RBOS modules 142 and 144 and a NACK to the scheduler 42.

[0090] The RBOS module 144 is needed because the FDL buffer to be used by a data burst is chosen by the scheduler 42, not determined by the RB switch 26. It is therefore quite possible that a data burst can enter the RB switch 26 but cannot get out of the RB switch 26 due to outgoing channel contention. An example is shown in Figure 13, where three fixed-length data bursts 148a-c arrive at the 2x2 RB switch 26. The first two data bursts 148a-b will be delayed by a time 2D while the third DB will be delayed by a time D. As shown, these three data bursts will leave the switch at the same time and contend for the two outgoing channels. The third data burst 148c is lost in this example.

[0091] The BHP transmission module 46 is responsible for transmitting the BHP on outgoing control channel 52 in the time frame determined by the BHP processor 82. Since the frame payload is fixed, equal to $P_f$, in slotted transmission, one possible implementation is illustrated in Figure 14, where the whole memory is divided into $W_c$ segments 150 and BHPs to be transmitted in the same time frame are stored in one segment 150. $W_c$ is the control channel scheduling window, which equals $2^{b_1}$. There is a memory pointer per segment (shown in segment $W_0$), pointing to the memory address where a new BHP can be stored. To distinguish BHPs within a frame, the frame overhead should contain a field indicating the number of BHPs in the frame. Furthermore, each BHP should contain a length field indicating the packet length (e.g., in bytes), from the first byte to the last byte of the BHP.

[0092] Suppose $t_c$ is the current time frame during which the BHP is received by the BHP transmission module and $p_c$ points to the current memory segment. Given the BHP departure time frame $t_f$, the memory segment to store this BHP is calculated from $(p_c + (t_f - t_c + 2^{b_1}) \bmod 2^{b_1}) \bmod 2^{b_1}$.
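The segment computation is a circular-buffer index calculation. A minimal sketch, with the function name chosen only for illustration:

```python
def bhp_segment(p_c, t_f, t_c, b1):
    """Memory segment for a BHP departing in frame t_f, given current frame t_c
    and current segment pointer p_c, in a window of 2**b1 segments."""
    window = 1 << b1
    frames_ahead = (t_f - t_c + window) % window   # handles frame-counter wraparound
    return (p_c + frames_ahead) % window
```

For b1 = 6, a BHP departing in frame 1 while the current frame is 63 is two frames ahead; with the pointer at segment 62 it is stored in segment 0.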

[0093] Figure 15 shows the optical router architecture using passive FDL loops 160 as the recirculation buffer, where the number of recirculation channels $R = R_1 + R_2 + \dots + R_B$, with the jth channel group introducing $Q_j$ delay time, $1 \le j \le B$. Here the recirculation channels are differentiated, while in Figure 1b all the recirculation channels are equivalent, able to provide B different delays. The potential problem of using the passive FDL loops is the higher blocking probability of accessing the shared FDL buffer. For example, suppose $B = 2$, $R = 4$ and $R_1 = 2$, $R_2 = 2$, and currently two recirculation channels of $R_1$ are in use. If a new DB needs to be delayed by $Q_1$ time, it may be successfully scheduled in Figure 1b as there are still two idle recirculation channels. However, it cannot be scheduled in Figure 15, since the two channels able to delay $Q_1$ are busy.

[0094] The design of the SCU 32 is almost the same as described previously, except for the following changes: (1) the RBOS module 144 within the path & channel selector 44 (see Figure 12) is no longer needed, and (2) a slight modification is required in the RBIS module 142 to distinguish recirculation channels if $B > 1$. To reduce the blocking probability of accessing the FDL buffer when $B > 1$, the scheduler is required to provide more than one delay option for each data burst that needs to be buffered. The impact on the design of the scheduler and path & channel selector 44 is addressed below. Without loss of generality, it is assumed in the following discussion that the scheduler 42 has to schedule the data burst and the BHP for B+1 possible delays.

[0095] The design of DCS module 84 shown in Figure 5 remains valid in this implementation. The search results could be stored in the format shown in Table 1 (assuming B=2), where the indicator (1/0) indicates whether or not an eligible data channel is found for a given delay, say $Q_i$. The memory type (0/1) indicates Pm or Pg. The entry index gives the location in the memory, which will be used for information update later on. The channel identifier column gives the identifiers of the channels found. The DCS module then passes the indicator column and the channel identifier column (only those with indicator 1) to the BHP processor.

Table 1: Stored search results in DCS module (B=2).

Delay   Indicator   Memory type   Entry index                     Channel identifier
        (1 bit)     (1 bit)       (max(log2 G, log2 K) bits)      (log2 K bits)
$Q_0$
$Q_1$
$Q_2$


[0096] The design of CCS module 86 shown in Figure 9 also remains valid. The search results could be stored in the format shown in Table 2 (assuming B=2), where the indicator (1/0) indicates whether or not the BHP can be scheduled on the control channel for a given DB departure time. The entry index gives the location in the memory, which will be used for information update later on. The "frame to send BHP" column gives the time frames in which the BHP is scheduled to be sent out. The CCS module then passes the indicator column and the "frame to send BHP" column (only those with indicator 1) to the BHP processor.



Table 2: Stored search results in CCS module (B=2).

Delay   Indicator   Entry index      Frame to send BHP
        (1 bit)     ($b_1$ bits)     ($b_1$ bits)
$Q_0$
$Q_1$
$Q_2$


[0097] After comparing the indicator columns from the DCS and CCS modules, the BHP processor 82 in Figure 3 knows whether the data burst and its BHP can be scheduled for a given FDL delay $Q_i$, $1 \le i \le B$, and determines which configuration information will be sent to the path & channel selector 44 in Figure 12. The three possible scenarios are: (1) the data burst can be scheduled without using an FDL buffer, (2) the data burst can be scheduled using an FDL buffer, and (3) the data burst cannot be scheduled.
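The comparison of the two indicator columns can be sketched as follows. The sketch assumes that index 0 of each column corresponds to $Q_0$ (no FDL buffer) and that the no-buffer option is preferred whenever it is available; the function name and the scenario labels are hypothetical.

```python
def classify(dcs_indicator, ccs_indicator):
    """Combine the DCS and CCS indicator columns (index 0 = Q_0, no FDL buffer)."""
    ok = [bool(d) and bool(c) for d, c in zip(dcs_indicator, ccs_indicator)]
    if ok[0]:
        return "no_fdl", [0]                      # scenario (1)
    candidates = [i for i in range(1, len(ok)) if ok[i]]
    if candidates:
        return "fdl", candidates                  # scenario (2): candidate FDL identifiers
    return "drop", []                             # scenario (3)
```

A delay is usable only when both the data channel and the control channel can be scheduled for it, which is why the two columns are combined with a logical AND.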

[0098] In the third case, the data burst and its BHP are simply discarded. In 
the first case, the following information will be sent to the path & channel 
selector: incoming DCG identifier, incoming data channel identifier, outgoing 
DCG identifier, outgoing data channel identifier, data burst arrival time to the 
spatial switch, data burst duration, FDL identifier 0 (i.e., $Q_0$). The path &
channel selector 44 will immediately send back an ACK after receiving the 
information. In the second case, the following information will be sent to the 
path & channel selector: 

■ incoming DCG identifier, 

■ incoming data channel identifier, 

■ number of candidate FDL buffers x,

■ for i = 1 to x do

• outgoing DCG identifier, 

• outgoing data channel identifier, 

• FDL identifier i, 

■ data burst arrival time to the spatial switch, 

■ data burst duration. 

[0099] In the second scenario, the path & channel selector 44 will search for an idle buffer channel to carry the data burst. The RBIS module 142 is similar to the one described in connection with Figure 12, except that now it has a Pm and Pg pair for each group of channels with delay $Q_i$, $1 \le i \le B$. An example is shown in Figure 16 for B=2. With one parallel search, the RBIS module will know whether the data burst can be scheduled. When x=1, the RBIS module 142 performs a parallel search on (Pm1 90a, Pg1 92a) or (Pm2 90b, Pg2 92b), depending on which FDL buffer is selected by the BHP processor 82. If an idle buffer channel is found, it will inform the processor 140, which in turn sends an ACK to the BHP processor 82. When x=2, both (Pm1, Pg1) and (Pm2, Pg2) will be searched. If two idle channels with different delays are found, the channel with delay $Q_1$ is chosen. In this case, an ACK together with the information that $Q_1$ is chosen will be sent to the BHP processor 82. After a successful search, the RBIS module 142 will update the corresponding Pm and Pg pair.

[00100] Figures 17 - 26 illustrate variations of the LAUC-VF method, cited above. In the LAUC-VF method cited above, two associative processors Pm and Pg are used to store the status of all channels of the same outbound link. Specifically, Pm stores r words, one for each of the r data channels of an outbound link. It is used to record the unscheduled times of these channels. Pg contains n superwords, one for an available time interval (a gap) of some data channel. The times stored in Pm and Pg are relative times. Pm and Pg support associative search operations, and data movement operations for maintaining the times in a sorted order. Due to parallel processing, Pm and Pg are used as major components to meet stringent real-time channel scheduling requirements.

[00101] In the embodiment described in Figures 22-23, a pair of associative processors Pm and Pg for the same outbound link are combined into one associative processor Pmg. The advantage of using a unified Pmg to replace a pair of Pm and Pg is the simplification of the overall core router implementation. In terms of ASIC implementation, the development cost of a Pmg can be much lower than that of a pair of Pm and Pg. Pmg can be used to implement a simpler variation of the LAUC-VF method with faster performance.

[00102] In Figures 17a and 17b, two outbound channels $Ch_1$ and $Ch_2$ are shown, with $t_0$ being the current time. With respect to $t_0$, channel $Ch_1$ has two DBs, $DB_1$ and $DB_2$, scheduled and channel $Ch_2$ has $DB_3$ scheduled. The time between $DB_1$ and $DB_2$ on $Ch_1$, which is a maximal time interval that is not occupied by any DB, is called a gap. The times labeled $t_1$ and $t_2$ are the unscheduled times for $Ch_1$ and $Ch_2$, respectively. After $t_1$ and $t_2$, $Ch_1$ and $Ch_2$, respectively, are available for transmitting any DB.

[00103] The LAUC-VF method tries to schedule DBs according to certain priorities. For example, suppose that a new data burst $DB_4$ arrives at time t'. For the situation of Figure 17a, $DB_4$ can be scheduled within the gap on $Ch_1$, or on $Ch_2$ after the unscheduled time of $Ch_2$. The LAUC-VF method selects $Ch_1$ for $DB_4$, and two gaps are generated from one original gap. For the situation of Figure 17b,
DB4 conflicts with DBi on Chi and conflicts with DB3 on Chi. But by using FDL 
buffers, it may be scheduled for transmission without conflicting DBs on Chi 

15 and/or DBs on Chi. Figure 17b shows the scheduling that DB4 is assigned to On, 
and a new gap is generated. 

[00104] Assuming that an outbound link has r data channels, the status of this link can be characterized by two sets:

$S_M = \{(t_i, i) \mid t_i \text{ is the unscheduled time for channel } Ch_i\}$
$S_G = \{(l_j, r_j, c_j) \mid l_j < r_j \text{ and the interval } [l_j, r_j] \text{ is a gap on channel } Ch_{c_j}\}$

[00105] In the embodiment of LAUC-VF proposed in U.S. Ser. No. 09/689,584, the two associative processors Pm and Pg were proposed to represent $S_M$ and $S_G$, respectively. Due to fixed memory word length, the times stored in the associative memory M of Pm and the associative memory G of Pg are relative times. Suppose the current time is $t_0$. Then any time value less than $t_0$ is of no use for scheduling a new DB. Let

$S'_M = \{(\max\{t_i - t_0, 0\}, i) \mid (t_i, i) \in S_M\}$
$S'_G = \{(\max\{l_j - t_0, 0\}, \max\{r_j - t_0, 0\}, c_j) \mid (l_j, r_j, c_j) \in S_G\}$

The times in $S'_M$ and $S'_G$ are times relative to the current time $t_0$, which is used as a reference point 0. Thus, M of Pm and G of Pg are actually used to store $S'_M$ and $S'_G$, respectively.

[00106] The channel scheduler proposed in U.S. Ser. No. 09/689,584 assumes that DBs have arbitrary lengths. One possibility is to assume a slot transmission mode. In this mode, DBs are transmitted in units of slots, and BHPs are transmitted as groups, and each group is carried by a slot. A slot clock $CLK_s$ is used to determine the slot boundary. The slot transmissions are triggered by pulses of $CLK_s$. Thus, the relative time is represented in terms of the number of $CLK_s$ cycles. The pulses of $CLK_s$ are shown in Figure 18. In addition to clock $CLK_s$, there is another, finer clock $CLK_f$. The period of $CLK_s$ is a multiple of the period of $CLK_f$. In the example shown in Figure 18, one $CLK_s$ cycle contains sixteen $CLK_f$ cycles. Clock $CLK_f$ is used to coordinate operations performed within a period of $CLK_s$.

[00107] In Figures 19a and 19b, modifications to the hardware design of Pm and Pg given in U.S. Ser. No. 09/689,584 are provided to accommodate slot transmissions. In Pm, there is an associative memory M of r words. Each word $M_i$ of M is essentially a register, and it is associated with a subtractor 200. A register MC holds an operand. In the embodiment of Figure 19a, the value stored in MC is the elapsed time since the last update of M. The value stored in MC is broadcast to all words $M_i$, $1 \le i \le r$. Each word $M_i$ does the following: $M_i \leftarrow M_i - MC$ if $M_i > MC$; otherwise, $M_i \leftarrow 0$. This operation is used to update the relative times stored in M. If MC stores the elapsed time since the last parallel subtraction operation was performed, performing this operation again updates these times to be relative to the time when the new PARALLEL-SUBTRACTION is performed. Another operation is the parallel comparison. In this operation, the value stored in MC is broadcast to all words $M_i$, $1 \le i \le r$. Each word $M_i$ does the following: if $MC \ge M_i$ then $MFLAG_i = 1$, otherwise $MFLAG_i = 0$. Signals $MFLAG_i$, $1 \le i \le r$, are transformed into an address by a priority encoder. This address and the word with this address are output to the address and data registers, respectively, of M. This operation is used to find a channel for the transmission of a given DB. Similarly, two subtractors are used for a word, one for each sub-word, of the associative memory G in Pg.
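The two word-level operations of Figure 19a can be modelled sequentially as below. This is a behavioural sketch only: the hardware applies the operation to all words at once, addresses are taken as 1-based with 0 meaning "no match" (an assumption carried over from the PARALLEL-SEARCH description), and the function names are hypothetical.

```python
def parallel_subtraction(M, mc):
    """Each word becomes M_i - MC if M_i > MC, otherwise 0."""
    return [m - mc if m > mc else 0 for m in M]

def parallel_comparison(M, mc):
    """Return the MFLAG vector and the priority-encoded address.

    MFLAG_i = 1 when MC >= M_i. The encoder reports the lowest 1-based
    address whose flag is set, or 0 when no flag is set.
    """
    flags = [1 if mc >= m else 0 for m in M]
    for addr, flag in enumerate(flags, start=1):
        if flag:
            return flags, addr
    return flags, 0
```

Because M is kept in non-increasing order, the first flagged word holds the latest unscheduled time that does not exceed the comparand.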



[00108] An alternative design, shown in Figure 19b, is to implement each word $M_i$ in M as a decrement counter with parallel load. The counter is decremented by 1 by every pulse of the system slot clock $CLK_s$. The counting stops when the counter reaches 0, and the counting resumes once the counter is set to a new positive value. Suppose that at time $t_0$ the counter's value is $t'$ and at time $t_1 > t_0$ the counter's value is $t''$. Then $t''$ is the same time as $t'$, but relative to $t_1$, i.e. $t'' = \max\{t' - (t_1 - t_0), 0\}$. Note that any negative time (i.e. $t' - (t_1 - t_0) < 0$) with the new reference point $t_1$ is not useful in the lookahead channel scheduling. Associated with each word $M_i$ is a comparator 204. It is used for the parallel comparison operation. Similarly, a word of G in Pg can be implemented by two decrement counters with two associated comparators.

fjl 

[00109] The system has a c-bit circular increment counter $C_s$. The value of $C_s$ is incremented by 1 by every pulse of slot clock $CLK_s$. Let $t_{latency}(BHP_i)$ be the time,

in terms of number of S'g cycles, between the time BHPi is received by the router 
and the time $BHP_i$ is received by the channel scheduler. The value c is chosen such that:

> ~ MAX, 

where $MAX_s$ is the number of $CLK_f$ cycles within a $CLK_s$ cycle. When $BHP_i$ is received by the router, $BHP_i$ is timestamped by the operation $timestamp_{recv}(BHP_i) \leftarrow C_s$. When $BHP_i$ is received by the scheduler of the router, $BHP_i$ is timestamped again by $timestamp_{sch}(BHP_i) \leftarrow C_s$. Let

$D_i = (timestamp_{recv}(BHP_i) + 2^c - timestamp_{sch}(BHP_i)) \bmod 2^c$.

Then, the relative arrival time (in terms of slot clock $CLK_s$) of $DB_i$ at the optical switching matrix of the router is $T_i = \Delta + \tau_i + D_i$, where $\tau_i$ is the offset time between $BHP_i$ and $DB_i$ and $\Delta$ is the fixed input FDL time. Using the slot time at which $timestamp_{sch}(BHP_i) \leftarrow C_s$ is performed as the reference point, and the relative times stored in Pm and Pg, $DB_i$ can be correctly scheduled.

[00110] In the hardware implementation of the LAUC-VF method, associative processors Pm and Pg are used to store and process $S'_M$ and $S'_G$, respectively. At any time, $S'_M = \{(t_i, i) \mid 1 \le i \le r\}$ and $S'_G = \{(l_j, r_j, c_j) \mid l_j \ge 0\}$. A pair $(t_i, i)$ in $S'_M$ represents the unscheduled time on channel $Ch_i$, and a triple $(l_j, r_j, c_j)$ in $S'_G$ represents a time gap (interval) $[l_j, r_j]$ on channel $Ch_{c_j}$. The unscheduled time $t_i$ can be considered as a semi-infinite gap (interval) $[t_i, \infty]$. Thus, by including such semi-infinite gaps into $S'_G$, $S'_M$ is no longer needed.

[00111] More specifically, let $S''_M = \{(t_i, \infty, i) \mid (t_i, i) \in S'_M\}$, and define $S'_{MG} = S''_M \cup S'_G$. The basic idea of combining Pm and Pg is to build Pmg by modifying Pg so that Pmg is used to process $S'_{MG}$. We present the architecture of associative processor Pmg for replacing Pm and Pg. Pmg uses an associative memory MG to store pairs in $S'_M$ and triples in $S'_G$. As with G in Pg, each word of MG has two sub-words, with the first one for $l_j$ and the second one for $r_j$ when it is used to store $(l_j, r_j, c_j)$. When a word of MG is used to store a pair $(t_i, i)$ of $S'_M$, the first sub-word is used for $t_i$, and the second is left unused. The first r words are reserved for $S'_M$, and the remaining words are reserved for $S'_G$. The first r words are maintained in non-increasing order of their first sub-word. The remaining words are also maintained in non-increasing order of their first sub-word. New operations for Pmg are defined.
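The layout of the unified memory can be illustrated with a short sketch. Representing the unused second sub-word by infinity is a modelling convenience of this sketch, not something the text prescribes, and the names `build_mg` and `fits` are hypothetical.

```python
import math

def build_mg(s_m, s_g):
    """Model S'_MG as a list of (start, end, channel) words.

    s_m: (t_i, channel) pairs; each becomes a semi-infinite gap (t_i, inf, channel).
    s_g: (l_j, r_j, channel) triples.
    The first len(s_m) words hold the unscheduled times, the rest hold the gaps;
    each part is kept in non-increasing order of its first sub-word.
    """
    first = sorted(((t, math.inf, ch) for t, ch in s_m),
                   key=lambda w: w[0], reverse=True)
    rest = sorted(s_g, key=lambda w: w[0], reverse=True)
    return first + rest

def fits(word, start, end):
    """True when the interval [start, end] lies inside the word's gap."""
    l, r, _channel = word
    return l <= start and end <= r
```

Treating an unscheduled time as a gap that never ends lets one containment test serve both kinds of entry.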

[00112] Below, the structures and operations of Pm and Pg are summarized, and the structure and operations of Pmg are defined. The differences between Pmg and Pg include the number of address registers used, the priority encoders, and the operations supported. It is shown that Pmg can be used to implement the LAUC-VF method without any slow-down, in comparison with the implementation using Pm and Pg.

[00113] The outbound data channel of a core router has r channels (wavelengths) for data transmission. These channels are denoted by $Ch_1, Ch_2, \dots, Ch_r$. Let $S = \{t_i \mid 1 \le i \le r\}$, where $t_i$ is the unscheduled time for channel $Ch_i$. In other words, at any time after $t_i$, channel $Ch_i$ is available for transmission. Pm is an associative processor for fast search of $T' = \min\{t_i \mid t_i \ge T\}$, where T is a given time. Suppose that $T' = t_j$; then channel $Ch_j$ is considered as a candidate data channel for transmitting a DB at time T.

[00114] For purposes of illustration, the structures of Pm and Pg are shown in Figures 20 and 21 and Pmg is shown in Figure 22.

[00115] An embodiment of Pm 210 is shown in Figure 20. Associative processor Pm includes an associative memory M 212 of k words, $M_1, M_2, \dots, M_k$, one for each channel of the data channel group. Each word is associated with a simple subtraction circuit for subtraction and compare operations. The words are also connected as a linear array. Comparand register MC 214 holds the operand for comparison. MCH 216 is a memory of k words, $MCH_1, MCH_2, \dots, MCH_k$, with $MCH_j$ corresponding to $M_j$. The words are connected as a linear array, and they are used to hold the channel numbers. $MAR_1$ 218 and $MAR_2$ 220 are address registers for holding addresses for accessing M and MCH. MDR 222 and MCHR 224 are data registers used to access M and MCH along with the MARs.

[00116] Associative processor Pm supports the following major operations that are used in the efficient implementation of the LAUC-VF channel scheduling operations:

RANDOM-READ: Given address x in $MAR_1$, do $MDR \leftarrow M_x$, $MCHR \leftarrow MCH_x$.

RANDOM-WRITE: Given address x in $MAR_1$, do $M_x \leftarrow MDR$, $MCH_x \leftarrow MCHR$.

PARALLEL-SEARCH: The value of MC is compared with the values of all words $M_1, M_2, \dots, M_k$ simultaneously (in parallel). Find the smallest j such that $M_j \le MC$, and do $MAR_1 \leftarrow j$, $MDR \leftarrow M_j$, and $MCHR \leftarrow MCH_j$. If there does not exist any word $M_j$ such that $M_j \le MC$, $MAR_1 = 0$ after this operation.

SEGMENT-SHIFT-DOWN: Given addresses a in $MAR_1$ and b in $MAR_2$ such that $a < b$, perform $M_{j+1} \leftarrow M_j$ and $MCH_{j+1} \leftarrow MCH_j$ for all $a \le j < b$.
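The four operations can be modelled in software as a small class. This is a behavioural sketch with 1-based addresses; the class name `PmModel` and its method names are hypothetical, and the registers are replaced by ordinary arguments and return values.

```python
class PmModel:
    """Sequential model of memories M and MCH with 1-based addresses."""

    def __init__(self, times, channels):
        self.k = len(times)
        self.M = [None] + list(times)        # index 0 is unused
        self.MCH = [None] + list(channels)

    def random_read(self, x):
        return self.M[x], self.MCH[x]

    def random_write(self, x, time, channel):
        self.M[x], self.MCH[x] = time, channel

    def parallel_search(self, mc):
        """Smallest j with M_j <= mc, as (j, M_j, MCH_j); j = 0 when none exists."""
        for j in range(1, self.k + 1):
            if self.M[j] <= mc:
                return j, self.M[j], self.MCH[j]
        return 0, None, None

    def segment_shift_down(self, a, b):
        """M_{j+1} <- M_j and MCH_{j+1} <- MCH_j for a <= j < b."""
        for j in range(b - 1, a - 1, -1):    # descending so no word is overwritten early
            self.M[j + 1], self.MCH[j + 1] = self.M[j], self.MCH[j]
```

One plausible use, offered only as an illustration: when the unscheduled time of the channel stored at address b grows to a new value, a search for that value gives the insertion address a, SEGMENT-SHIFT-DOWN moves words a through b-1 down by one (overwriting the stale word at b), and RANDOM-WRITE stores the new pair at a, so M stays in non-increasing order.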

[00117] For RANDOM-READ, RANDOM-WRITE and SEGMENT-SHIFT-DOWN operations, each pair $(M_j, MCH_j)$ is treated as a superword. The output of PARALLEL-SEARCH consists of r binary signals, $MFLAG_i$, $1 \le i \le r$. $MFLAG_i = 1$ if and only if $M_i \le MC$. There is a priority encoder with $MFLAG_i$, $1 \le i \le r$, as input, and it produces an address j, and this value is loaded into $MAR_1$ when the PARALLEL-SEARCH operation is completed. RANDOM-READ, RANDOM-WRITE, PARALLEL-SEARCH and SEGMENT-SHIFT-DOWN operations are used to maintain the non-increasing order of values stored in M.

[00118] Figure 21 illustrates a block diagram of the associative processor Pg 92. Pg is used to store unused gaps of all channels of an outbound link of a core router. A gap is represented by a pair $(l, r)$ of integers, where $l$ and $r$ are the beginning and the end of the gap, respectively. Associative processor Pg includes associative memory G 93, comparand register GC 230, memory GCH 232, address register GAR 234, data registers GDR 236 and GCHR 238 and working registers $GR_1$ 240 and $GR_2$ 242.

[00119] G is an associative memory of n words, $G_1, G_2, \dots, G_n$, with each $G_i$ consisting of two sub-words $G_{i,1}$ and $G_{i,2}$. The words are connected as a linear array. GC holds a word of two sub-words, $GC_1$ and $GC_2$. GCH is a memory of n words, $GCH_1, GCH_2, \dots, GCH_n$, with $GCH_j$ corresponding to $G_j$. The words are connected as a linear array, and they are used to hold the channel numbers. GAR is an address register used to hold the address for accessing G. GDR and GCHR are data registers used to access G and GCH, together with GAR.

[00120] Associative processor Pg supports the following major operations that
are used in the efficient implementation of the LAUC-VF channel scheduling
operations:

RANDOM-WRITE: Given address x in GAR, do G_{x,1} ← GDR_1, G_{x,2} ←
GDR_2, GCH_x ← GCHR.

PARALLEL-DOUBLE-COMPARAND-SEARCH: The value of GC is
compared with the values of all words G_1, G_2, ..., G_n simultaneously (in parallel).
Find the smallest j such that G_{j,1} < GC_1 and G_{j,2} > GC_2. If this operation is
successful, then do GDR_1 ← G_{j,1}, GDR_2 ← G_{j,2}, GCHR ← GCH_j, and GAR ← j;
otherwise, GAR ← 0.

PARALLEL-SINGLE-COMPARAND-SEARCH: The value of GC_1 is
compared with the values of all words G_1, G_2, ..., G_n simultaneously (in parallel).
Find the smallest j such that G_{j,1} > GC_1. If this operation is
successful, then do GDR_1 ← G_{j,1}, GDR_2 ← G_{j,2}, GCHR ← GCH_j, and GAR ← j;
otherwise, GAR ← 0.

BIPARTITION-SHIFT-UP: Given address a in GAR, do G_j ← G_{j+1},
GCH_j ← GCH_{j+1} for a ≤ j < n, and G_{n,1} ← 0, G_{n,2} ← 0.

BIPARTITION-SHIFT-DOWN: Given address a in GAR, do G_{j+1} ←
G_j, GCH_{j+1} ← GCH_j, a ≤ j < n.
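
The operations of Pg listed above can be sketched as a small software model. This is an illustrative Python sketch under stated assumptions, not the associative hardware: the class name `GapProcessor`, the use of (0, 0) to mark an unused word, and the sequential loops (which stand in for comparisons and shifts that the hardware performs in parallel) are all choices made for this example.

```python
class GapProcessor:
    """Sequential software model of Pg (illustrative; addresses are 1-based)."""

    def __init__(self, n):
        self.n = n
        self.G = [(0, 0)] * n   # (G_{j,1}, G_{j,2}); (0, 0) marks an unused word here
        self.GCH = [0] * n      # channel number associated with each gap

    def random_write(self, x, gap, channel):
        """RANDOM-WRITE at address x."""
        self.G[x - 1] = gap
        self.GCH[x - 1] = channel

    def double_comparand_search(self, gc1, gc2):
        """Smallest j with G_{j,1} < GC_1 and G_{j,2} > GC_2, or 0 if none."""
        for j in range(1, self.n + 1):
            begin, end = self.G[j - 1]
            if begin < gc1 and end > gc2:
                return j
        return 0

    def shift_up(self, a):
        """BIPARTITION-SHIFT-UP: word a is overwritten and word n is cleared."""
        for j in range(a, self.n):
            self.G[j - 1] = self.G[j]
            self.GCH[j - 1] = self.GCH[j]
        self.G[self.n - 1] = (0, 0)

    def shift_down(self, a):
        """BIPARTITION-SHIFT-DOWN: words a..n-1 move down one place; word n is lost."""
        for j in range(self.n - 1, a - 1, -1):
            self.G[j] = self.G[j - 1]
            self.GCH[j] = self.GCH[j - 1]


pg = GapProcessor(4)
pg.random_write(1, (50, 80), 2)
pg.random_write(2, (20, 45), 1)
pg.random_write(3, (5, 15), 3)
hit = pg.double_comparand_search(25, 40)   # a burst occupying slots 25 to 40
print(hit, pg.GCH[hit - 1])
```

After a shift-down at address a, word a still holds its old content and is expected to be overwritten by a RANDOM-WRITE of the new gap.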

[00121] In Pg, a triple (G_{i,1}, G_{i,2}, GCH_i) corresponds to a gap with beginning
time G_{i,1} and ending time G_{i,2} on channel GCH_i. For RANDOM-WRITE,
PARALLEL-DOUBLE-COMPARAND-SEARCH, PARALLEL-SINGLE-COMPARAND-SEARCH,
BIPARTITION-SHIFT-UP, and BIPARTITION-SHIFT-DOWN operations, each triple
(G_{i,1}, G_{i,2}, GCH_i) is treated as a superword. The output of the
PARALLEL-DOUBLE-COMPARAND-SEARCH (resp. PARALLEL-SINGLE-COMPARAND-SEARCH)
operation consists of n binary signals, GFLAG_i, 1 ≤ i ≤ n, such that
GFLAG_i = 1 if and only if G_{i,1} > GC_1 and G_{i,2} < GC_2 (resp. G_{i,1} > GC_1).
There is a priority encoder with GFLAG_i, 1 ≤ i ≤ n, as input, and it produces an
address j, and this value is loaded into GAR when the operation is completed.
RANDOM-WRITE, PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations maintain the non-increasing order of the
values G_{i,1} stored in G.

[00122] The operations of Pm and Pg are discussed in greater detail in U.S. Ser.
No. 09/689,584.

[00123] Figure 22 illustrates a block diagram of a processor Pmg, which
combines the functions of the Pm and Pg processors described above. Pmg
includes associative memory MG 248, comparand register MGC 250, memory
MGCH 252, address registers MGAR_1 254a and MGAR_2 254b, and data registers
MGDR 256 and MGCHR 258.


[00124] MG is an associative memory of m = r + n words, MG_1, MG_2, ..., MG_m,
with each MG_i consisting of two sub-words MG_{i,1} and MG_{i,2}. The words are also
connected as a linear array. MGC is a comparand register that holds a word of
two sub-words, MGC_1 and MGC_2. MGCH is a memory of m words, MGCH_1,
MGCH_2, ..., MGCH_m, with MGCH_j corresponding to MG_j. The words are connected
as a linear array, and they are used to hold the channel numbers.

[00125] Associative processor Pmg supports the following major operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG_{x,1},
MGDR_2 ← MG_{x,2}, MGCHR ← MGCH_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG_{x,1} ← MGDR_1,
MG_{x,2} ← MGDR_2, MGCH_x ← MGCHR.

PARALLEL-COMPOUND-SEARCH: In parallel, the value of MGC_1
is compared with the values of all superwords MG_i, 1 ≤ i ≤ m, and the values of
MGC_1 and MGC_2 are compared with all superwords MG_j, r + 1 ≤ j ≤ m.
(i) If MGC_2 ≠ 0, then do the following in parallel: Find the smallest j' such that
j' ≤ r and MG_{j',1} < MGC_1. If this search is successful, then do MGAR_1 ← j';
otherwise, MGAR_1 ← 0. Find the smallest j'' such that r + 1 ≤ j'' ≤ m,
MG_{j'',1} < MGC_1 and MG_{j'',2} > MGC_2. If this search is successful, then do
MGAR_2 ← j'' and MGCHR ← MGCH_{j''}; otherwise MGAR_2 ← 0. (ii) If MGC_2 = 0,
then find the smallest j' such that 1 ≤ j' ≤ m and MG_{j',1} < MGC_1. If this search
is successful, then do MGAR_1 ← j' and MGCHR ← MGCH_{j'}; otherwise MGAR_1 ← 0.

BIPARTITION-SHIFT-UP: Given address a in MGAR_1, do MG_j ←
MG_{j+1}, MGCH_j ← MGCH_{j+1} for a ≤ j < m, and MG_{m,1} ← 0, MG_{m,2} ← 0.

SEGMENT-SHIFT-DOWN: Given addresses a in MGAR_1 and b in
MGAR_2 such that a < b, perform MG_{j+1} ← MG_j and MGCH_{j+1} ← MGCH_j for all
a ≤ j < b.

[00126] As in Pg, a triple (MG_{i,1}, MG_{i,2}, MGCH_i) may correspond to a gap with
beginning time MG_{i,1} and ending time MG_{i,2} on channel MGCH_i. But in such a
case, it must be that i > r. If i ≤ r, then MG_{i,2} is immaterial, the pair (MG_{i,1},
MGCH_i) is interpreted as the unscheduled time MG_{i,1} on channel MGCH_i, and
this pair corresponds to a word in Pm. For RANDOM-READ, RANDOM-WRITE,
PARALLEL-COMPOUND-SEARCH, BIPARTITION-SHIFT-UP and
SEGMENT-SHIFT-DOWN operations, each triple (MG_{i,1}, MG_{i,2}, MGCH_i) is
treated as a superword. The first r superwords are used for storing the
unscheduled times of the r outbound channels, and the last m - r superwords are
used to store information about gaps on all outbound channels.

[00127] The output of the PARALLEL-COMPOUND-SEARCH operation consists
of binary signals MGFLAG_i whose values are defined as follows: (i) if MGC_2 = 0
and MG_{i,1} > MGC_1 then MGFLAG_i = 1; (ii) if MGC_2 ≠ 0, i ≤ r, and MG_{i,1} > MGC_1
then MGFLAG_i = 1; (iii) if MGC_2 ≠ 0, i > r, MG_{i,1} > MGC_1 and MG_{i,2} < MGC_2 then
MGFLAG_i = 1, or if MG_{i,1} > MGC_1 and i ≤ r then MGFLAG_i = 1; and (iv) otherwise,
MGFLAG_i = 0.

[00128] There are two encoders. The first one uses MGFLAG_i, 1 ≤ i ≤ r, as its
input, and it produces an address in MGAR_1 after a PARALLEL-COMPOUND-SEARCH
operation is performed if MGC_2 ≠ 0. The second encoder uses
MGFLAG_i, r + 1 ≤ i ≤ m, as its input. It produces an address in MGAR_2 after a
PARALLEL-COMPOUND-SEARCH operation is performed if MGC_2 ≠ 0. There is
a selector with the outputs of the two encoders as its input. If MGC_2 = 0, the
smallest non-zero address produced by the two encoders, if such an address
exists, is loaded into MGAR_1 after a PARALLEL-COMPOUND-SEARCH
operation is performed; otherwise, MGAR_1 is set to 0 after the operation is
performed. If MGC_2 ≠ 0, the output of the selector is disabled.
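
The interplay of the two encoders and the selector can be illustrated with a short software model. This Python sketch takes the flag vector as given and only models the encoding stage; the function names `priority_encode` and `encode_search_result`, and the use of `None` to mean "register not written", are assumptions made for this example.

```python
def priority_encode(flags, first, last):
    """Smallest 1-based index i in [first, last] with flags[i-1] == 1, else 0."""
    for i in range(first, last + 1):
        if flags[i - 1]:
            return i
    return 0


def encode_search_result(MGFLAG, r, MGC2):
    """Combine the two encoders and the selector (hypothetical helper).

    Returns (MGAR1, MGAR2); MGAR2 is None when it is not written.
    """
    m = len(MGFLAG)
    first_encoder = priority_encode(MGFLAG, 1, r)        # unscheduled-time words
    second_encoder = priority_encode(MGFLAG, r + 1, m)   # gap words
    if MGC2 != 0:
        # Selector disabled: each encoder drives its own address register.
        return first_encoder, second_encoder
    # Selector enabled: smallest non-zero address, or 0 if both encoders report 0.
    candidates = [a for a in (first_encoder, second_encoder) if a != 0]
    return (min(candidates) if candidates else 0), None


print(encode_search_result([0, 1, 0, 0, 1], r=2, MGC2=7))
```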

[00129] RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations
are used to maintain the non-increasing order of values stored in MG_{i,1} of the first
r words, and the non-increasing order of the values stored in MG_{i,1} of the last
m - r words.

[00130] The operations of associative processors Pm and Pg can be carried out
by operations of Pmg without any delay when they are used to implement the
LAUC-VF channel scheduling method. We assume that Pmg contains m = r + n
superwords. In Table 3 (resp. Table 4), the operations of Pm (resp. Pg) given in
the left column are carried out by the operations of Pmg given in the right column.
Instead of searching Pm and Pg concurrently, using Pmg, this step can be carried
out by a PARALLEL-COMPOUND-SEARCH operation.



Table 3: Simulation of Pm by Pmg

Pm                    Pmg
--------------------  ------------------------------------
RANDOM-READ           RANDOM-READ
RANDOM-WRITE          RANDOM-WRITE
PARALLEL-SEARCH       PARALLEL-COMPOUND-SEARCH
SEGMENT-SHIFT-DOWN    SEGMENT-SHIFT-DOWN (with MGAR_2 = m)



Table 4: Simulation of Pg by Pmg

Pg                                Pmg
--------------------------------  ----------------------------------------
RANDOM-WRITE                      RANDOM-WRITE
PARALLEL-DOUBLE-COMPARAND-SEARCH  PARALLEL-COMPOUND-SEARCH
PARALLEL-SINGLE-COMPARAND-SEARCH  PARALLEL-COMPOUND-SEARCH (with MGC_2 = 0)
BIPARTITION-SHIFT-UP              SEGMENT-SHIFT-UP (with MGAR_2 = m)
BIPARTITION-SHIFT-DOWN            SEGMENT-SHIFT-DOWN (with MGAR_2 = m - 1)



[00131] In the LAUC-VF method, fitting a given DB into a gap is preferred,
even if the DB can be scheduled on another channel after its unscheduled time, as
shown by the example of Figures 17a-b. With separate Pm and Pg, and with search
operations performed on Pm and Pg simultaneously, this priority is justifiable.
However, the overall circuit for doing so may be considered too complex.

[00132] By combining Pm and Pg into one associative processor, simpler and
faster variations of the LAUC-VF method are possible. An alternative
embodiment is shown in Figure 23. In this figure, processor P*mg 270 includes an
array TYPE 272 with m bits, each bit being associated with a corresponding word
in memory MG. If TYPE_i = 1 then MG_i stores an item of S'm; otherwise, MG_i stores
an item of S'g. Further, register TYPER 274 is a one-bit register used to access
TYPE, together with MGAR_1 and MGAR_2.

[00133] Other differences between P*mg and Pmg include the priority encoder
used and the operations supported. When a new DB is scheduled, MG is
searched. The fitting time interval found, regardless of whether it is a gap or a
semi-infinite interval, will be used for the new DB. Once the DB is scheduled, one
more gap may be generated. As long as there is sufficient space in MG, the new
gap is stored in MG. When MG is full, an item of S'g may be lost. But it is
enforced that all items of S'm must be kept.

[00134] Let t_s^{out}(DB_i) and t_e^{out}(DB_i) be the transmitting times of the first
and last slot of DB_i at the output of the router, respectively. Then

    t_s^{out}(DB_i) = T_i + L_j

and

    t_e^{out}(DB_i) = T_i + L_j + length(DB_i),

where T_i is the relative arrival time defined above, L_j is the FDL delay time
selected for DB_i in the switching matrix and length(DB_i) is the length of DB_i in
terms of number of slots. Assume that there are q + 1 FDLs L_0, L_1, ..., L_q in the DB
switching matrix such that L_0 = 0 < L_1 < L_2 < ... < L_{q-1} < L_q. The new variation of
LAUC-VF is sketched as follows:

method CHANNEL-SCHEDULING
begin
    success ← 0;
    for j = 0 to q do
        MGC_1 ← T_i + L_j;
        MGC_2 ← T_i + L_j + length(DB_i);
        perform PARALLEL-COMPOUND-SEARCH using P*mg;
        if MGAR_1 ≠ 0 then
        begin
            output MGCHR as the number of the channel for transmitting DB_i;
            output L_j as the selected FDL delay time for DB_i;
            update MG of P*mg using the values in MGC_1 and MGC_2;
            success ← 1;
            exit /* exit the for-loop */
        end
    endfor
    if success = 0 then drop DB_i /* scheduling for DB_i has failed */
end
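
The control flow of CHANNEL-SCHEDULING can be sketched in software as follows. This is a minimal Python illustration, assuming the search condition of the P*mg PARALLEL-COMPOUND-SEARCH (TYPE = 1 and first sub-word below MGC_1, or TYPE = 0 with the gap enclosing the burst). The function names, the list-based memory layout and the example data are hypothetical, and the update of MG after a successful search is left out.

```python
def compound_search(MG, TYPE, MGCH, mgc1, mgc2):
    """Smallest 1-based j matching the search condition, with its channel; (0, None) if none."""
    for j, ((begin, end), kind) in enumerate(zip(MG, TYPE), start=1):
        fits_after_unscheduled_time = kind == 1 and begin < mgc1
        fits_in_gap = kind == 0 and begin < mgc1 and end > mgc2
        if fits_after_unscheduled_time or fits_in_gap:
            return j, MGCH[j - 1]
    return 0, None


def channel_scheduling(MG, TYPE, MGCH, T, length, fdls):
    """Try the FDL delays in increasing order; return (channel, delay, address) or None."""
    for delay in fdls:                    # fdls[0] is L_0 = 0
        mgc1 = T + delay                  # t_s^out of the burst
        mgc2 = T + delay + length         # t_e^out of the burst
        address, channel = compound_search(MG, TYPE, MGCH, mgc1, mgc2)
        if address != 0:
            # The update of MG after a successful search is omitted in this sketch.
            return channel, delay, address
    return None                           # the burst is dropped


# Four superwords ordered by non-increasing first sub-word (made-up example data).
MG = [(70, 0), (40, 55), (30, 0), (10, 20)]
TYPE = [1, 0, 1, 0]
MGCH = [2, 2, 1, 1]
print(channel_scheduling(MG, TYPE, MGCH, T=42, length=10, fdls=[0, 5]))
```

In the example data, words with TYPE = 1 carry an unscheduled time in the first sub-word and ignore the second, mirroring the convention for items of S'm.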

[00135] Once a DB is scheduled, MG is updated. When a gap is to be added
into MG and TYPE_m = 1, the new gap is ignored. This ensures that no item
belonging to S'm is lost.

[00136] Associative processor P*mg supports the following major operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG_{x,1},
MGDR_2 ← MG_{x,2}, MGCHR ← MGCH_x, TYPER ← TYPE_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG_{x,1} ← MGDR_1,
MG_{x,2} ← MGDR_2, MGCH_x ← MGCHR, TYPE_x ← TYPER.

PARALLEL-COMPOUND-SEARCH: The value of MGC_1 is compared
with the values of all superwords MG_i, 1 ≤ i ≤ m, and MGC_2 is compared with
all superwords MG_i, 1 ≤ i ≤ m, whose TYPE_i = 0, in parallel. Find the smallest j
such that TYPE_j = 1 and MG_{j,1} < MGC_1, or TYPE_j = 0, MG_{j,1} < MGC_1 and
MG_{j,2} > MGC_2. If this search is successful, then do MGAR_1 ← j, TYPER ← TYPE_j,
MGCHR ← MGCH_j; otherwise, MGAR_1 ← 0.

BIPARTITION-SHIFT-UP, SEGMENT-SHIFT-DOWN: same as in Pmg.

[00137] In operation, the value of TYPE_i indicates the type of information
stored in MG_i. As in Pg, a triple (MG_{i,1}, MG_{i,2}, MGCH_i) may correspond to a gap
with beginning time MG_{i,1} and ending time MG_{i,2} on channel MGCH_i. But in such
a case, it must be that TYPE_i = 0. If TYPE_i = 1, then MG_{i,2} is immaterial, the pair
(MG_{i,1}, MGCH_i) is interpreted as the unscheduled time MG_{i,1} on channel MGCH_i,
and this pair corresponds to a word in Pm. For RANDOM-READ, RANDOM-WRITE,
PARALLEL-COMPOUND-SEARCH, BIPARTITION-SHIFT-UP and
SEGMENT-SHIFT-DOWN operations, each quadruple (MG_{i,1}, MG_{i,2}, TYPE_i,
MGCH_i) is treated as a superword.

[00138] The output of the PARALLEL-COMPOUND-SEARCH operation consists
of binary signals MGFLAG_i whose values are defined as follows. If MGC_2 ≠ 0,
TYPE_i = 0, MG_{i,1} > MGC_1 and MG_{i,2} < MGC_2 then MGFLAG_i = 1. If MGC_2 = 0 and
MG_{i,1} > MGC_1 then MGFLAG_i = 1. Otherwise, MGFLAG_i = 0. There is a priority
encoder. It uses MGFLAG_i, 1 ≤ i ≤ m, as its input, and it produces an address in
MGAR_1 after a PARALLEL-COMPOUND-SEARCH operation is performed.

[00139] RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations
are used to maintain the non-increasing order of the values MG_{i,1} stored in MG.

[00140] Figure 24 illustrates the use of multiple associative processors for fast
scheduling. Channel scheduling for an OBS core router is very time critical, and
multiple associative processors (shown in Figure 24 as P*mg processors 270),
which are parallel processors, are proposed to implement scheduling methods.
Suppose that there are q + 1 FDLs L_0 = 0, L_1, ..., L_q in the DB switching matrix
such that L_0 < L_1 < ... < L_q. These FDLs are used, when necessary, to delay DBs
and increase the possibility that the DBs can be successfully scheduled. In the
implementation of the LAUC-VF method presented in U.S. Ser. No. 09/689,584,
the same pair of Pm and Pg is searched repeatedly using different FDLs until a
scheduling solution is found or all FDLs are exhausted. The method
CHANNEL-SCHEDULING described above uses the same idea.

[00141] To speed up the scheduling, a scheduler 42 may use q + 1 Pm/Pg pairs,
one for each L_j. At any time, all q + 1 Ms have the same content, all q + 1 MCHs
have the same content, all q + 1 Gs have the same content, and all q + 1 GCHs
have the same content. Then finding a scheduling solution for all different FDLs
can be performed on these Pm/Pg pairs simultaneously. At most one search result
is used for a DB. All Pm/Pg pairs are updated simultaneously by the same
lock-step operations to ensure that they store the same information. Similarly, one
may use q + 1 Pmgs or P*mgs to speed up the scheduling.

[00142] In Figure 24, a multiple processor system 300 uses q + 1 P*mgs 270 to
implement the method CHANNEL-SCHEDULING described above. Similarly,
the LAUC-VF method can be implemented using multiple Pm/Pg pairs, or
multiple Pmgs, in a similar way to achieve better performance. The multiple P*mgs
270 include q + 1 associative memories MG^0, MG^1, ..., MG^q. Each MG^j has m
words MG^j_1, MG^j_2, ..., MG^j_m, with each MG^j_i consisting of two sub-words
MG^j_{i,1} and MG^j_{i,2}. There are q + 1 comparand registers MGC^0, MGC^1, ..., MGC^q.
Each MGC^j holds a word of two sub-words, MGC^j_1 and MGC^j_2. There are q + 1
associative memories MGCH^0, MGCH^1, ..., MGCH^q. Each MGCH^j has m
words, MGCH^j_1, MGCH^j_2, ..., MGCH^j_m. The words in MGCH^j are connected as a
linear array. There are q + 1 linear arrays TYPE^0, TYPE^1, ..., TYPE^q, where TYPE^j
has m bits, TYPE^j_1, TYPE^j_2, ..., TYPE^j_m. MGAR_1 and MGAR_2 are address
registers used to hold addresses for accessing MG and MGCH. MGDR, TYPER and
MGCHR are data registers used to access the MGs, TYPEs and MGCHs.

[00143] This multiple processor system 300 supports the following major
operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG^0_{x,1},
MGDR_2 ← MG^0_{x,2}, MGCHR ← MGCH^0_x, TYPER ← TYPE^0_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG^j_{x,1} ← MGDR_1,
MG^j_{x,2} ← MGDR_2, MGCH^j_x ← MGCHR, TYPE^j_x ← TYPER, for 0 ≤ j ≤ q.

PARALLEL-COMPOUND-SEARCH: For 0 ≤ j ≤ q, the value of MGC^j_1
is compared with the values of all superwords MG^j_i, 1 ≤ i ≤ m, and MGC^j_2 is
compared with all superwords MG^j_i, 1 ≤ i ≤ m, whose TYPE^j_i = 0, in parallel. For
0 ≤ j ≤ q, find the smallest k_j such that TYPE^j_{k_j} = 1 and MG^j_{k_j,1} < MGC^j_1, or
TYPE^j_{k_j} = 0, MG^j_{k_j,1} < MGC^j_1 and MG^j_{k_j,2} > MGC^j_2. If this search is
successful, let l_j = 1; otherwise let l_j = 0. Find FD = min{j | l_j = 1, 0 ≤ j ≤ q}. If
such j exists, then do j ← FD, MGAR_1 ← k_j, TYPER ← TYPE^j_{k_j}, MGCHR ←
MGCH^j_{k_j}; otherwise, MGAR_1 ← 0.

BIPARTITION-SHIFT-UP: Given address a in MGAR_1, for 0 ≤ j ≤ q, do
MG^j_i ← MG^j_{i+1}, MGCH^j_i ← MGCH^j_{i+1}, for a ≤ i < m, and MG^j_{m,1} ← 0,
MG^j_{m,2} ← 0.

SEGMENT-SHIFT-DOWN: Given addresses a in MGAR_1 and b in
MGAR_2 such that a < b, for 0 ≤ j ≤ q do MG^j_{i+1} ← MG^j_i and MGCH^j_{i+1} ←
MGCH^j_i for all a ≤ i < b.

[00144] A RANDOM-READ operation is performed on one copy of P*mg, i.e.
MG^0, TYPE^0, and MGCH^0. RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations
are performed on all copies of P*mg. For RANDOM-READ, RANDOM-WRITE,
PARALLEL-COMPOUND-SEARCH, BIPARTITION-SHIFT-UP and
SEGMENT-SHIFT-DOWN operations, each quadruple (MG_{i,1}, MG_{i,2}, TYPE_i, MGCH_i) is
treated as a superword. When a PARALLEL-COMPOUND-SEARCH operation
is performed, the outputs of all P*mg copies are the inputs of the selectors. The
output of one P*mg copy is selected.

[00145] The CHANNEL-SCHEDULING method may be implemented in the
multiple processor system as:

method PARALLEL-CHANNEL-SCHEDULING
begin
    success ← 0;
    for j = 0 to q do in parallel
        MGC^j_1 ← T_i + L_j;
        MGC^j_2 ← T_i + L_j + length(DB_i);
    endfor
    perform PARALLEL-COMPOUND-SEARCH;
    if MGAR_1 ≠ 0 then
    begin
        output MGCHR as the number of the channel for transmitting DB_i;
        output L_FD as the selected FDL delay time for DB_i;
        k ← FD;
        for j = 0 to q do in parallel
            update MG^j, 0 ≤ j ≤ q, using the values in MGC^k_1 and MGC^k_2
        endfor
        success ← 1;
    end
    if success = 0 then drop DB_i /* scheduling for DB_i has failed */
end
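
The selection step of PARALLEL-CHANNEL-SCHEDULING, in which every copy searches with its own FDL delay and the smallest successful delay index FD wins, can be illustrated as follows. This is a hedged Python sketch: the searches run one after another here rather than simultaneously, a single shared list stands in for the q + 1 identical copies, and the function names and example data are hypothetical. The update of the copies after a successful search is omitted.

```python
def compound_search(MG, TYPE, MGCH, mgc1, mgc2):
    """Smallest 1-based k matching the P*mg search condition, with its channel."""
    for k, ((begin, end), kind) in enumerate(zip(MG, TYPE), start=1):
        # TYPE = 1: only the first sub-word matters; TYPE = 0: the gap must enclose the burst.
        if begin < mgc1 and (kind == 1 or end > mgc2):
            return k, MGCH[k - 1]
    return 0, None


def parallel_channel_scheduling(MG, TYPE, MGCH, T, length, fdls):
    """Return (channel, delay, address) for the smallest successful FDL index, or None."""
    # One search per copy; copy j uses comparands T + L_j and T + L_j + length.
    results = [compound_search(MG, TYPE, MGCH, T + d, T + d + length) for d in fdls]
    successful = [j for j, (k, _) in enumerate(results) if k != 0]   # copies with l_j = 1
    if not successful:
        return None                      # the burst is dropped
    FD = min(successful)                 # FD = min{ j | l_j = 1 }
    k, channel = results[FD]
    return channel, fdls[FD], k


MG = [(70, 0), (40, 55), (30, 0), (10, 20)]   # made-up example data
TYPE = [1, 0, 1, 0]
MGCH = [2, 2, 1, 1]
print(parallel_channel_scheduling(MG, TYPE, MGCH, T=5, length=3, fdls=[0, 8, 30]))
```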

[00146] It may be desirable to be able to partition the r data channels into
groups and choose a particular group to schedule DBs. Such situations may
occur on several occasions. For example, one may want to test a particular
channel. In such a situation, the channel to be tested by itself forms a channel
group, and all other channels form another group. Then, channel scheduling is
only performed on the 1-channel group. Another occasion is that during the
operation of the router, some channels may fail to transmit DBs. Then, the
channels of the same outbound link can be partitioned into two groups, the
group that contains all failed channels and the group that contains all normal
channels, and only normal channels are to be selected for transmitting DBs.
Partitioning data channels also allows channel reservation, which has
applications in quality of service. Using reserved channel groups, virtual
circuits and virtual networks can be constructed.

[00147] To incorporate the group partition feature into the channel scheduling
associative processors, the basic idea is to associate a group identifier (or gid for
short) with each channel. For a link, all the channels that share the same gid
belong to the same group. The gid of a channel is programmable; i.e., it can be
changed dynamically according to need. The gid for a DB can be derived from its
BHP and/or some other local information.

[00148] The design of Pm and Pg may be extended to PM-ext and PG-ext to
incorporate multiple channel groups, as shown in Figures 25 and 26,
respectively. As shown in Figure 25, associative processor PM-ext 290 includes M,
MC, MCH, MAR_1, MAR_2, MDR, MCHR, as described in connection with Figure
20. MGIDC 292 is a comparand register that holds the gid for comparison. MGID
294 is a memory of r words, MGID_1, MGID_2, ..., MGID_r, with MGID_j
corresponding to M_j and MCH_j. The words are connected as a linear array, and
they are used to hold the channel group numbers. MGIDDR 296 is a data
register.

[00149] PM-ext is similar to Pm, with the addition of several components and
modified operations. The linear array MGID has r locations, MGID_1, MGID_2, ...,
MGID_r; each is used to store an integer gid. MGID_i is associated with M_i and
MCH_i, i.e. a triple (M_i, MCH_i, MGID_i) is treated as a superword. Comparand
register MGIDC and data register MGIDDR are added.

[00150] Associative processor PM-ext supports the following major operations
that are used in the efficient implementation of the LAUC-VF channel scheduling
operations.

RANDOM-READ: Given address x in MAR_1, do MDR ← M_x, MCHR
← MCH_x and MGIDDR ← MGID_x.

RANDOM-WRITE: Given address x in MAR_1, do M_x ← MDR, MCH_x
← MCHR and MGID_x ← MGIDDR.

PARALLEL-SEARCH1: Simultaneously, MGIDC is compared with
the values of MGID_1, MGID_2, ..., MGID_r. Find j such that MGID_j = MGIDC, and
do MAR_1 ← j, MDR ← M_j, MCHR ← MCH_j, and MGIDDR ← MGID_j.

PARALLEL-SEARCH2: Simultaneously, (MC, MGIDC) is compared
with (M_1, MGID_1), (M_2, MGID_2), ..., (M_r, MGID_r). Find the smallest j such that
M_j < MC and MGID_j = MGIDC, and do MAR_1 ← j, MDR ← M_j, MCHR ← MCH_j, and
MGIDDR ← MGID_j. If there does not exist any word (M_j, MGID_j) such that M_j <
MC and MGID_j = MGIDC, MAR_1 = 0 after this operation.

SEGMENT-SHIFT-DOWN: Given addresses a in MAR_1 and b in
MAR_2 such that a < b, perform M_{j+1} ← M_j, MCH_{j+1} ← MCH_j and MGID_{j+1} ←
MGID_j for all a ≤ j < b.
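
The two group-aware searches of PM-ext can be illustrated with a short software model. This Python sketch is an assumption-laden illustration: the function names are hypothetical, the comparisons are evaluated sequentially instead of in parallel, and PARALLEL-SEARCH1 is assumed to resolve multiple matches through the priority encoder by returning the smallest matching address (0 if there is none).

```python
def parallel_search1(MGID, gidc):
    """Smallest 1-based j with MGID_j == MGIDC, or 0 if no channel is in the group."""
    return next((j for j, gid in enumerate(MGID, start=1) if gid == gidc), 0)


def parallel_search2(M, MGID, mc, gidc):
    """Smallest 1-based j with M_j < MC and MGID_j == MGIDC, or 0 if none."""
    return next(
        (j for j, (t, gid) in enumerate(zip(M, MGID), start=1) if t < mc and gid == gidc),
        0,
    )


M = [90, 70, 40, 10]      # unscheduled times, non-increasing (made-up data)
MGID = [1, 2, 1, 2]       # group identifier of each channel word
print(parallel_search2(M, MGID, mc=80, gidc=1))
```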

[00151] For RANDOM-READ, RANDOM-WRITE and SEGMENT-SHIFT-DOWN
operations, each triple (M_j, MCH_j, MGID_j) is treated as a superword. The
output of PARALLEL-SEARCH1 consists of r binary signals, MFLAG_i, 1 ≤ i ≤ r.
MFLAG_i = 1 if and only if MGID_i = MGIDC. There is a priority encoder with
MFLAG_i, 1 ≤ i ≤ r, as input, and it produces an address j, and this value is loaded
into MAR_1 when the PARALLEL-SEARCH1 operation is completed. The output of
PARALLEL-SEARCH2 consists of r binary signals, MFLAG_i, 1 ≤ i ≤ r. MFLAG_i = 1 if
and only if M_i < MC and MGID_i = MGIDC. The same priority encoder used in
PARALLEL-SEARCH1 transforms MFLAG_i, 1 ≤ i ≤ r, into an address j, and this
value is loaded into MAR_1 when the PARALLEL-SEARCH2 operation is completed.
RANDOM-READ, RANDOM-WRITE, PARALLEL-SEARCH2 and
SEGMENT-SHIFT-DOWN operations are used to maintain the non-increasing order of
values stored in M.

[00152] Figure 26 illustrates a block diagram of PG-ext. PG-ext 300 includes
G, GC, GCH, GAR, GDR, GCHR, as described in connection with Figure 21. GGIDC
302 is a comparand register for holding the gid for comparison. GGID 304 is a
memory of n words, GGID_1, GGID_2, ..., GGID_n, with GGID_j corresponding to G_j
and GCH_j. The words are connected as a linear array, and they are used to hold
the channel group numbers. GGIDR 306 is a data register.

[00153] Similar to the architecture of PM-ext, a linear array GGID of n words,
GGID_1, GGID_2, ..., GGID_n, is added to Pg. A quadruple (G_{i,1}, G_{i,2}, GCH_i, GGID_i) is
treated as a superword.

[00154] Associative processor PG-ext supports the following major operations
that are used in the efficient implementation of the LAUC-VF channel scheduling
operations.

RANDOM-WRITE: Given address x in GAR, do G_{x,1} ← GDR_1, G_{x,2} ←
GDR_2, GCH_x ← GCHR, GGID_x ← GGIDR.

PARALLEL-DOUBLE-COMPARAND-SEARCH: The value of (GC,
GGIDC) is compared with (G_1, GGID_1), (G_2, GGID_2), ..., (G_n, GGID_n)
simultaneously (in parallel). Find the smallest j such that G_{j,1} < GC_1, G_{j,2} > GC_2
and GGID_j = GGIDC. If this operation is successful, then do GDR_1 ← G_{j,1}, GDR_2
← G_{j,2}, GCHR ← GCH_j, GGIDR ← GGID_j and GAR ← j; otherwise, GAR ← 0.

PARALLEL-SINGLE-COMPARAND-SEARCH: (GC_1, GGIDC) is
compared with (G_{1,1}, GGID_1), (G_{2,1}, GGID_2), ..., (G_{n,1}, GGID_n) simultaneously (in
parallel). Find the smallest j such that G_{j,1} > GC_1 and GGID_j = GGIDC. If this
operation is successful, then do GDR_1 ← G_{j,1}, GDR_2 ← G_{j,2}, GCHR ← GCH_j,
GGIDR ← GGID_j and GAR ← j; otherwise, GAR ← 0.

BIPARTITION-SHIFT-UP: Given address a in GAR, do G_j ← G_{j+1},
GCH_j ← GCH_{j+1}, GGID_j ← GGID_{j+1} for a ≤ j < n, and G_{n,1} ← 0, G_{n,2} ← 0.

BIPARTITION-SHIFT-DOWN: Given address a in GAR, do G_{j+1} ←
G_j, GCH_{j+1} ← GCH_j, GGID_{j+1} ← GGID_j, a ≤ j < n.

[00155] A quadruple (G_{i,1}, G_{i,2}, GCH_i, GGID_i) corresponds to a gap with
beginning time G_{i,1} and ending time G_{i,2} on channel GCH_i, whose gid is in GGID_i.
For RANDOM-WRITE, PARALLEL-DOUBLE-COMPARAND-SEARCH,
PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations, each quadruple (G_{i,1}, G_{i,2}, GCH_i,
GGID_i) is treated as a superword. The output of the PARALLEL-DOUBLE-COMPARAND-SEARCH
(resp. PARALLEL-SINGLE-COMPARAND-SEARCH)
operation consists of n binary signals, GFLAG_i, 1 ≤ i ≤ n, such that GFLAG_i = 1 if and
only if G_{i,1} > GC_1, G_{i,2} < GC_2 and GGID_i = GGIDC (resp. G_{i,1} > GC_1 and
GGID_i = GGIDC). There is a priority encoder with GFLAG_i, 1 ≤ i ≤ n, as input, and
it produces an address j, and this value is loaded into GAR when the operation is
completed. RANDOM-WRITE, PARALLEL-SINGLE-COMPARAND-SEARCH,
BIPARTITION-SHIFT-UP, and BIPARTITION-SHIFT-DOWN operations maintain the
non-increasing order of the values G_{i,1} stored in G.

[00156] Changing the gid of a channel Ch_j from g_1 to g_2 is done as follows: find
the triple (M_i, MCH_i, MGID_i) such that MCH_i = j, and store i into MAR_1 and the
triple into (MDR, MCHR, MGIDDR); set MGIDDR ← g_2, and write back (MDR,
MCHR, MGIDDR) using the address i in MAR_1.
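
The read-modify-write sequence for changing a channel's gid can be sketched in software. This minimal Python illustration uses a hypothetical function `change_gid` and plain lists for MCH and MGID; how the hardware locates the word with MCH_i = j is not modeled, and returning 0 when the channel is not found is an assumption.

```python
def change_gid(MCH, MGID, channel, new_gid):
    """Set the gid of `channel` to `new_gid`; return the 1-based address used, or 0."""
    for i, ch in enumerate(MCH, start=1):   # locate the triple with MCH_i == channel
        if ch == channel:
            mgiddr = new_gid                # MGIDDR <- g_2
            MGID[i - 1] = mgiddr            # write back at the address held in MAR_1
            return i
    return 0


MCH = [3, 1, 2]       # channel numbers (made-up data)
MGID = [1, 1, 1]      # all channels start in group 1
print(change_gid(MCH, MGID, channel=2, new_gid=5), MGID)
```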

[00157] Given a DB', t_s^{out}(DB'), t_e^{out}(DB'), and a gid g, the scheduling of DB'
involves searches in PM-ext and PG-ext. Searching in PM-ext is done as follows: find
the smallest i such that M_i < t_s^{out}(DB') and MGID_i = g. Searching in PG-ext is done
as follows: find the smallest i such that G_{i,1} < t_s^{out}(DB'), G_{i,2} > t_e^{out}(DB'), and
GGID_i = g.

J- f [00158] Similar ly /associative processors Pg-ct* and PG*-ext can be constructed 
|N 15 by adding a gid comparand register MGGIDQ a memory MGGID of m words 
JO MGGIDi, MGGID 2 , , MGGlDm, and a data register MGGIDDR. P M G~ext is a 

I > combination of Pu-ext and PG-ext. The operations of PMG-ext can be easily derived 

from the operations of PM-ext and PG-ext since the PM-ext items and the PG-ext items 

are separated. In P*MG-ext, the PM-ext items and the PG-ext items are mixed. Since the 
20 MGi,i values of these items are in non-decreasing order, finding the PM-ext item 

corresponding channel Ch can be carried out by finding the smallest; such that 

MGGIDj = i. 

[00159] Although the Detailed Description of the invention has been directed
to certain exemplary embodiments, various modifications of these embodiments,
as well as alternative embodiments, will be suggested to those skilled in the art.
The invention encompasses any modifications or alternative embodiments that
fall within the scope of the Claims.